Article

"Teaching to the Test" in the NCLB Era: How Test Predictability Affects Our Understanding of Student Performance


Abstract

What is “teaching to the test,” and can one detect evidence of this practice in state test scores? This paper unpacks this concept and empirically investigates one variant of it by analyzing test item–level data from three states’ mathematics and reading tests. We show that NCLB-era state tests predictably emphasized some state standards while consistently excluding others; a small number of standards typically accounted for a substantial fraction of test points. We find that students performed better on items testing frequently assessed standards—those that composed a larger fraction of the state test in prior years—which suggests that teachers targeted their instruction towards these predictably tested skills. We conclude by describing general principles that should guide high-stakes test construction if a policy goal is to ensure that test score gains accurately represent gains in student learning.
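The measurement idea in the abstract, a standard's share of test points in prior years set against performance on items testing that standard, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the item records, the standard labels, the function names, and the 0.5 cut-off are hypothetical, and the simple group comparison stands in for whatever statistical model the authors actually used.

```python
from collections import defaultdict

# Hypothetical item records: (year, standard_id, points, proportion_correct).
ITEMS = [
    (2006, "S1", 2, 0.70), (2006, "S1", 2, 0.72), (2006, "S2", 1, 0.55),
    (2007, "S1", 3, 0.74), (2007, "S3", 1, 0.50),
    (2008, "S1", 1, 0.78), (2008, "S2", 1, 0.58), (2008, "S3", 1, 0.52),
]


def prior_share(items, target_year):
    """Share of all test points in years before target_year, per standard."""
    points = defaultdict(float)
    total = 0.0
    for year, standard, pts, _ in items:
        if year < target_year:
            points[standard] += pts
            total += pts
    return {s: p / total for s, p in points.items()} if total else {}


def mean_correct_by_group(items, target_year, threshold=0.5):
    """Mean proportion correct in target_year, split by prior-year share.

    The threshold is an arbitrary illustrative cut-off, not a value
    taken from the paper.
    """
    share = prior_share(items, target_year)
    groups = defaultdict(list)
    for year, standard, _, p_correct in items:
        if year == target_year:
            frequent = share.get(standard, 0.0) >= threshold
            groups["frequent" if frequent else "infrequent"].append(p_correct)
    return {k: sum(v) / len(v) for k, v in groups.items()}


if __name__ == "__main__":
    print(prior_share(ITEMS, 2008))
    print(mean_correct_by_group(ITEMS, 2008))
```

In this toy data, one standard holds most of the prior-year points, so a gap between the two group means is the kind of pattern the abstract describes; a real analysis would also need to account for item difficulty and other confounders.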


... Au 2007; Berliner 2011), spending more school hours on teaching to the test (e.g. Jennings and Bearak 2014; Ohemeng and McCall-Thomas 2013), and sorting students into groups to maximise the percentage who pass the test at the expense of inclusive and heterogeneous instruction (a practice referred to as 'educational triage;' see Booher-Jennings 2005; Ladd and Lauen 2010). TBA has also been found to increase stress and burnout among teachers who feel frustrated and deprofessionalised when constantly working under external assessments and dictations (e.g. ...
... In line with previous studies (e.g. Berliner 2011; Feniger, Israeli, and Yehuda 2016; Jennings and Bearak 2014), we found many indications of teaching to the test and pedagogical concentration on materials and skills covered by the Meitzav. Since the PLC programme we studied focused on professional development, these topics were not expected to appear in the discussions. ...
... According to the literature on the topic (Au, 2007; Cuban, 2007; Jennings & Bearak, 2014; Nichols & Berliner, 2007; Popham, 2001; Volante, 2004), the main effects of high-stakes tests can be summarized as follows: a) curricular content is reduced to the development of the assessed skills and is directed toward the subjects included in the tests; b) subject-area knowledge is fragmented into test-related pieces; and c) teachers increase their use of teacher-centered pedagogies. ...
... Teaching to the test: training or education. Teaching to the test is a learning process that is implemented indirectly in classrooms through the following phases: 1) alignment of the assessment standards with the curriculum, 2) emphasis on the content that will foreseeably be assessed, and 3) presentation of lessons in a format similar to the assessment test (Jennings & Bearak, 2014). On analysis, the first stage of this process is valid when educational institutions establish a national curriculum. ...
Article
The social, economic, and cultural changes arising from the current context of globalization have led new forms of public management to develop policies linked to accountability. These policies act as instruments for regulating and legitimizing the reforms implemented, and they directly influence the design and delivery of curricula. Within this framework, teaching to the test is a practice arising from the pressure exerted by educational policies centered on assessments of learning outcomes. This article aims to analyze the effects of these tests on teaching and learning processes.
... NCLB-era assessments predictably emphasize some state standards to the exclusion of others (Jennings & Bearak, 2014). In New York, Massachusetts and Texas, which made item-level data available to researchers, students performed better on items tapping standards that were assessed more frequently in previous years than on items reflecting less-frequently assessed standards (Jennings & Bearak, 2014). This strongly suggests that teachers were targeting their instruction towards these frequently-tested standards, and in so doing, distorting student performance. ...
Preprint
Providing a comprehensive introduction to the topic of accountability and datafication in the governance of education, the World Yearbook of Education 2021 considers global policy dynamics and policy enactment processes. Chapters pay particular attention to the role of international organizations and the private sector in the promotion of performance-based accountability (PBA) in different educational settings and at multiple policy scales. Organized into three sections, chapters cover: the global/local construction of accountability and datafication; global discourse and national translations of performance-based accountability policies; and enactments and effects of accountability and datafication, including controversies and critical issues. With carefully chosen international contributions from around the globe, the World Yearbook of Education 2021 is ideal reading for anyone interested in the future of accountability and datafication in the governance of education.
... Bevan and Hood, 2006; Broadbent and Laughlin, 1998; Smith, 1995). The literature on high-stakes testing (Amrein-Beardsley et al., 2010; Au, 2007; Jennings and Bearak, 2014) finds comparable results, emphasising unintended consequences, including gaming, e.g. cheating when grades are intentionally inflated artificially (Hasselbladh and Bejerot, 2020), and behavioural displacement. ...
... For example, Shirrell (2016) demonstrated inconsistent understandings of accountability and, thereby, different school-level interpretations of policy intentions and their alignment with organisational mission. Jennings and Bearak (2014) argued that teaching-to-the-test could increase testing validity by enabling students to demonstrate their abilities better. Similarly, Van Helden et al. (2012) and Speklé and Verbeeten (2014) suggested that performance measurement could introduce functional changes irrespective of intentionality. ...
Article
Purpose This paper studies how performance funding of education is perceived by principals, teachers and administrative staff and management. The dysfunctionality of performance measures often reflects how the measures prevent an organisation from achieving its goals. This paper proposes that perceptions of dysfunctionality can be analysed by separating the perceptions of the programme's intentions, of the school-level actions and of the outcomes for students. Design/methodology/approach Following a qualitative methodology, semi-structured interviews were conducted with teachers, school management, staff specialists and top management in a large Danish municipality when outcome-based funding was introduced. Findings The performance-funding programme affected teaching by changing educational priorities. Different perceptions of the (dys)functionality of intentions, actions and outcomes fuelled diverging responses. Although the performance measure was generally considered incomplete, interviewees' perceptions of the financial incentivisation and the dysfunctionality of actions depended on interpretations of the incentivisation and student-related outcomes of the programme. Research limitations/implications Dysfunctionality can be contested; the interpretations of the intention of a performance-funding programme affect the perceived dysfunctionality of reactions. Both technical characteristics of funding schemes and administrators' and principals' mediating roles are essential for the consequences of performance funding. Originality/value The paper examines conditions for dysfunctionality of performance measures. We demonstrate that actions can be perceived as dysfunctional because of a measurement's intentions, actions themselves and the actions' outcomes. Further, the paper illustrates how the reception of performance funding depends on how consequences are enacted based on educators' interpretations of the (dys)functionality of intentions, actions and outcomes.
... Generally, heightening stakes related to test scores seems to positively affect student outcomes, especially in mathematics (Figlio & Ladd, 2015). However, testing is not a complete measure of student learning or ability (Jennings & Bearak, 2014). Much research (e.g., Amrein-Beardsley, 2009; Brill et al., 2018; Deming et al., 2016; De Wolf & Janssens, 2007) has examined negative washback effects, for example, how "[a]ccountability pressures faced by teachers and leaders may lead well-intentioned educators to engage in strategic reporting and operational practices to increase test scores, graduation rates, and other indicators of student success" (Edwards & Mindrila, 2019, p. 3). ...
... Teaching to the test practices include, for instance, "students spend[ing] hours memorizing facts, learning test-taking strategies, bubbling score sheets accurately, eliminating unlikely distractor responses, [and] making educated guesses" (Amrein-Beardsley, 2009, p. 3). However, if students become more comfortable in test-taking situations, teaching to the test can enable students to more correctly demonstrate knowledge and skills (Jennings & Bearak, 2014). ...
Article
High-stakes testing is meant to create a positive washback effect on student learning. Performance funding can raise stakes. However, it is not often used, and its washback is uncertain. The purpose of this paper is to examine performance-funding programs based on students’ exam results. We study principals’ perceptions and interpretations of how this influenced stakes and washback effects of the exit exams. For that purpose, we selected four schools based on theoretical sampling criteria. The empirical data comprise semi-structured interviews with management over the 2-year program and documents describing the performance-funding program. The findings indicate that implementing performance funding increases stakes and has washback effects, but that stakes depend partly on the principal’s choices. Although the consequences were unintended, the program and its effects were mostly perceived as positive. The paper shows how unintended consequences call for careful consideration of the pros and cons of accountability systems when high-stakes test-based funding mechanisms are introduced.
... DE also creates space to expose students to skills and knowledge that might support democratic outcomes but are outside the scope of many standardized K-12 curricula. Due to the nature of K-12 governance, high school curriculum tends to prioritize breadth over depth and teachers face pressure to prepare students for standardized tests (Jennings & Bearak, 2014; Parker et al., 2011). State and local politics further shape high school curriculum, and, especially in recent years, have resulted in the censorship of books and curricular content that political and religious conservatives deem threatening (McClure, 2022), from evolution in biology class to slavery in history class (Berkman & Plutzer, 2012). ...
Article
Dual enrollment (DE) is a popular reform in the United States that allows high school students to take college courses through partnerships between school districts and institutions of higher education. DE programs have been scaling rapidly, but participation is stratified by race and class, and research reveals little about the quality and content of DE courses. These limitations stem, in part, from a lack of theorizing around what purpose DE reform can and should serve, both in the lives of youth and for communities writ large. Situated in literature on the purpose of education in capitalist democracies, this study employs qualitative content analysis to examine the rationales for DE coursework, as depicted in state-level policy documents. Findings indicate that DE policy rationales are depicted almost entirely in neoliberal economic terms. We argue that, while economic benefits are important, the almost exclusive emphasis on economic outcomes has led to rapid scaling of a curricular reform with insufficient attention to teaching, learning, and equity. To maximize the potential benefits of DE reform, we call for imagining its democratic possibilities.
... Still, critics of educational accountability policies have cited a variety of unintended negative consequences within the educational system that exacerbate inequities in education (Emler et al., 2019; Lane et al., 1998). Accountability assessments have led to a narrowing of the curriculum with an emphasis on skills (Jennings & Bearak, 2014; Lane, 2020), increased pressure on educators (Ro, 2019), a lack of educator autonomy (Kavanagh & Fisher-Ari, 2020), an overemphasis on test preparation (Sonnert et al., 2019), the inappropriate use of test scores (Tavassolie & Winsler, 2019), questionable practices in reassignment of teachers or principals (Lane, 2020; Martin, 2012), tracking of students (Giersch, 2018; Lane & Stone, 2002), penalizing teachers and schools (Arcia et al., 2011; Nichols & Harris, 2016), and encouraging cheating (Aronson et al., 2016). These results potentially have differential impact on historically excluded students. ...
Article
Recent criticisms of large‐scale summative assessments have claimed that the assessments are biased against historically excluded groups because of the assessments' lack of cultural representation. Accompanying these criticisms is a call for more culturally responsive assessments—assessments that take into account the background characteristics of the students; their beliefs, values, and ethics; their lived experiences; and everything that affects how they learn and behave and communicate. In this paper, we present provisional principles, based on a review of research, that we deem necessary for fostering cultural responsiveness in assessment. We believe the application of these principles can address the criticisms of current assessments.
... Optimizing purely for a measurable learning outcome could lead to farcical interventions such as simplifying learning content or making assessments easier. Less obvious could be an intervention that improves a student's examination skills, possibly improving learning outcomes without improving learning, a phenomenon known as "teaching to the test" (Jennings & Bearak, 2014). ...
Article
To promote cross-community dialogue on matters of significance within the field of learning analytics (LA), we as editors-in-chief of the Journal of Learning Analytics (JLA) have introduced a section for papers that are open to peer commentary. An invitation to submit proposals for commentaries on the paper was released, and 12 of these proposals were accepted. The 26 authors of the accepted commentaries are based in Europe, North America, and Australia. They range in experience from PhD students and early-career researchers to some of the longest-standing, most senior members of the learning analytics community.
... Although the reliance on external accountability tactics can cause short-term gains in standardized test scores, a lack of meaningful organizational change leads to the unsustainability of any short-term gains in student achievement (Fullan, 2015; Meyers & Smylie, 2017). Evidence suggests that the focus on increasing test scores alone can hinder a more holistic focus on inputs that augment student learning (Jacob, 2017; Jennings & Bearak, 2014). Overdependence on performance metrics can undercut the development of mastery goals linked to higher levels of collective efficacy (Ciani et al., 2008). ...
Article
This quantitative study aims to explore the validity of Donohoo et al.’s (2020) Enabling Conditions for Collective Teacher Efficacy Scale (EC-CTES) for fostering collective efficacy in schools and evaluate its relationship to measures of collective teacher efficacy. The instruments used for this study include the EC-CTES, the Collective Efficacy Scale (CES-SF), and the Collective Teacher Beliefs Scale (CTBS). The data were evaluated through confirmatory factor analysis, correlation matrices, and multiple regression models. The findings from this study demonstrate that the EC-CTES is a valid tool. The EC-CTES subscales are positively associated with measures of collective teacher efficacy. We recommend adjustments for the EC-CTES subscales for greater congruence with collective efficacy theory and practical application. Due to the theoretical density of collective teacher efficacy, a modified conceptual framework is proposed to make the enabling conditions theory more accessible to practitioners.
... Further, the negative effects of interventions designed to quickly meet accountability targets are well documented (see Balfanz et al., 2007; Dee et al., 2013; Jennings & Bearak, 2014). As an intervention implemented to respond to federal accountability pressure to increase high school graduation rates, OCR has the potential to introduce unintended negative side-effects like lower test scores. ...
Article
Prior to the COVID-19 pandemic, online credit recovery (OCR) was the most popular use of distance learning in high schools in the United States. With high course failure rates during the height of the COVID-19 pandemic, high schools have turned to OCR to help students recover lost credit. This study examined the potential consequences of increasing OCR enrollment at the school level using administrative data from North Carolina and found that increasing OCR enrollment is associated with higher rates of passing previously failed courses but with diminishing returns after about three-quarters of students who failed courses enrolling in OCR. Consistent OCR enrollment increases over four years is associated with higher graduation rates. Contrary to prior research, this study finds no evidence that school-level OCR enrollment increases are associated with lower test score proficiency rates. Using pre-pandemic data to help inform post-pandemic decision making, the results suggest that increasing OCR enrollment might address increased pandemic-induced course failure rates by expanding opportunities to re-earn course credit, but this would not necessarily translate to higher graduation rates.
... For both groups, however, it is likely there is something larger at play. Academic success is focused on obtaining a high GPA as the school context, teachers, parents, and peers put emphasis on high test scores, homework, and "teach to the test" type of education (Obama, n.d.; Jennings & Bearak, 2014). This is not to blame teachers, parents, or youth, but instead to recognize that the systems our students learn in reflect what is meant by "success" and how students achieve this success academically (McGuinn, 2012). ...
Article
Background: This longitudinal study examined growth trajectories of academic motivation in youth with and without attention-deficit/hyperactivity disorder (ADHD) across the important developmental transition from middle school to high school, and associations with academic success. Consistent with self-determination theory (SDT) of motivation, trajectories of amotivation, extrinsic motivation, and intrinsic motivation were modeled. Methods: The study included a robust multi-method, multi-source assessment of academic outcomes, including homework performance ratings; reading and mathematics standardized test scores; and grade point average (GPA) obtained from school records. Participants included 302 adolescents (ages 12-14; Mage = 13.20) in eighth grade who were specifically recruited so that approximately half (n = 162) were diagnosed with ADHD and 140 adolescents comprising a comparison sample without ADHD. The sample was predominantly White (81.80%), with 7.90% identifying as bi/multiracial, 5.30% identifying as Black/African American, 4.60% identifying as Asian, and 0.30% identifying as Indigenous/Alaskan. Results: Adolescents with ADHD had worse academic motivation at all timepoints. Growth curve analyses indicated the academic motivation of adolescents without ADHD decreased at faster rates across the transition to high school compared to adolescents with ADHD. However, for adolescents with ADHD, amotivation, extrinsic motivation, and intrinsic motivation each predicted GPA, with higher extrinsic and intrinsic motivation also predicting better homework performance and different aspects of math performance, whereas for youth without ADHD, only amotivation and extrinsic motivation predicted GPA. Conclusions: Intervention and school policy implications are discussed, including the importance of fostering autonomy and internal motivation, and consideration of whether current ADHD interventions primarily foster extrinsic motivation.
... However, NCLB's emphasis on proficiency rates also created perverse incentives for schools to focus on the so-called "bubble" students to the detriment of students throughout the achievement distribution or to push out students expected to score below proficient. In particular, improvements in student achievement and decreased achievement gaps were concentrated among students near the cusp of the test proficiency threshold, suggesting districts may have triaged supports to the students most likely to increase proficiency rates (Balfanz et al., 2007; Booher-Jennings, 2005; Cullen & Reback, 2006; Darling-Hammond, 2006; Ho, 2008; Jacob, 2005; Jennings & Bearak, 2014; Krieg, 2008; Reback, 2008). ...
Article
The recent Every Student Succeeds Act (ESSA) requires states to identify and turn around their lowest performing schools, but it breaks somewhat from prior policies by granting states significant autonomy over how they identify and turn around these schools. This mixed-methods study, which draws on administrative, qualitative, and survey data, examines the effectiveness of Michigan’s approach to school turnaround under ESSA. We find that students in turnaround schools experienced significant achievement gains in math and to a lesser extent in English language arts (ELA), with effects concentrated among the lowest achieving students. Analyses of qualitative and survey data suggest that these outcomes were influenced by state-level supports, strategic planning, the threat of accountability for continued low performance, and improved leadership quality in turnaround schools.
... This intention is related to a perceived disconnect between school learning and the value of learning outside of a school context. The contexts that teachers use for teaching in school are often academic (Giamellaro, 2017), in the sense that teachers 'teach to the test' to prepare students for academic achievement (Jennings & Bearak, 2014). Thus, although the research was conducted in the context of COVID-19, the results do not show that the use of outdoors was mainly prompted by the need to have space between students to reduce the spread of a deadly virus but still allow them to learn. ...
Article
Although the school curriculum of the province of Québec, Canada, does not explicitly encourage teachers to provide outdoor learning experiences, it appears that there is a growing momentum for outdoor education. Thus, the research question that guided this study was: What are preschool, elementary, and secondary teachers’ outdoor education practices in the province of Québec, Canada? To answer this research question, we conducted a survey that collected quantitative data to describe teachers’ intentions, the places they use, the learning that is targeted, and the challenges. We used descriptive statistics to analyze the data. Our results show that school-based outdoor education can realize complementary learning intentions that cannot be met by using only classrooms, that outdoor education can be practiced in a variety of places, regardless of the settings in a school’s immediate surroundings, and that school-based outdoor education has the potential to decrease sedentary behaviours and increase students’ levels of physical activity.
... This outcome is notable considering that the majority of phenotypes result from the interaction of genes in the environment (1-3, 5, 6, 54), and there is widespread student interest in these more complex traits (18,19). Furthermore, given that assessment is a key indicator of instruction and curricular content (42,44,55) and curriculum standards from elementary school through undergraduate include multifactorial concepts, such as the intersection of genetics with the environment (36,37,39), there are several opportunities to expand assessment question content. ...
Article
Undergraduate genetics courses have historically focused on simple genetic models, rather than taking a more multifactorial approach where students explore how traits are influenced by a combination of genes, the environment, and gene-by-environment interactions. While a focus on simple genetic models can provide straightforward examples to promote student learning, they do not match the current scientific understanding and can result in deterministic thinking among students. In addition, undergraduates are often interested in complex human traits that are influenced by the environment, and national curriculum standards include learning objectives that focus on multifactorial concepts. This research aims to discover to what extent multifactorial genetics is currently being assessed in undergraduate genetics courses. To address this, we analyzed over 1,000 assessment questions from a commonly used undergraduate genetics textbook; published concept assessments; and open-source, peer-reviewed curriculum materials. Our findings show that current genetics assessment questions overwhelmingly emphasize the impact of genes on phenotypes and that the effect of the environment is rarely addressed. These results indicate a need for the inclusion of more multifactorial genetics concepts, and we suggest ways to introduce them into undergraduate courses.
... A large part of the educational reforms that introduce market mechanisms are accompanied by systems of accountability and information systems for families, largely based on the results obtained by schools in external evaluations. Numerous strategies oriented to improve the students' performance in standardized external tests can fall under the category of teaching to the test practices (Jennings & Bearak, 2014). These types of practices include making pedagogical changes, narrowing the curriculum (to focus on those contents evaluated in standardized tests), setting objectives for each student, monitoring or increasing the diversity of programs offered to adapt to the specific needs of students (integration or language programs), and specific sessions oriented toward preparation for external exams Woods et al., 1998). ...
Article
List of papers:
Introduction
1. Sociological Contributions to School Choice Policy and Politics Around the Globe: Introduction to the 2020 PEA Yearbook. Amanda U. Potterton, D. Brent Edwards Jr., Ee-Seul Yoon, and Jeanne M. Powers
Section I: The Strategies and Responses of Schools and Families to School Choice Policies
2. School Counselors’ Assessment of the Legitimacy of High School Choice Policy. Carolyn Sattin-Bajaj and Jennifer L. Jennings
3. Schools in the Marketplace: Analysis of School Supply Responses in the Chilean Education Market. Adrián Zancajo
4. Opting for Private Education: Public Subsidy Programs and School Choice in Disadvantaged Contexts. Mauro Carlos Moschetti and Antoni Verger
5. The Development and Dynamics of Public–Private Partnerships in the Philippines’ Education: A Counterintuitive Case of School Choice, Competition, and Privatization. Andreu Termes, D. Brent Edwards Jr., and Antoni Verger
Section II: Sociology of School Choice Politics and Education Markets
6. Media Strategies in Policy Advocacy: Tracing the Justifications for Indiana’s School Choice Reforms. Joel R. Malin, Christopher Lubienski, and Queenstar Mensa-Bonsu
7. Ideas and the Politics of School Choice Policy: Portfolio Management in Philadelphia. Rand Quinn and Laura Ogburn
8. Parental Accountability, School Choice, and the Invisible Hand of the Market. Amanda U. Potterton
9. School Choice Research and Politics with Pierre Bourdieu: New Possibilities. Ee-Seul Yoon
Section III: Conflict and Competition for Resources in Organizational and Regulatory Contexts
10. Teacher Power and the Politics of Union Organizing in the Charter Sector. Huriya Jabbar, Jesse Chanin, Jamie Haynes, and Sara Slaughter
11. Rearranging the Chairs on the Deck or True Reform? Private Sector Bureaucracies in the Age of Choice—An Analysis of Autonomy and Control. Sarah Butler Jessen and Catherine DiMartino
Commentary
12. Toward a Global Political Sociology of School Choice Policies. Bob Lingard
... ESSA (2015) also had a significant impact on how teachers developed and implemented their curriculum, and research indicates it had its own set of unintended consequences (see Poiner, 2016). To date, much research has reported on the impact these policies have had on student academic performance (e.g., systemic negative consequences for EL students as reported by Menken, 2010; stagnation for top performers associated with high-stakes testing according to Loveless et al., 2008; longitudinal assessment student performance under both NCLB and ESSA in Hemet & Jacob, 2017) or teacher performance (such as when Jennings & Bearak, 2014 discuss teaching to the test and artificial grade inflation) or teacher quality (e.g., Porter-Magee, 2004; Schoen & Fusarelli, 2008). However, few studies have explored the teacher perspective of living through the curriculum demands of both policies and exploring the changing effect it had across time on their classrooms compared to what media was portraying. ...
... Research on the effects of systemic reform policies, such as NCLB and CCSS, often investigates whether these policies work by focusing on student outcomes typically measured in terms of student achievement (Dee & Jacob, 2011; Wei, 2012; Wong et al., 2018). Other studies focus on the effects on classroom instruction, such as narrowing the curriculum to tested subjects or attention to particular students (Au, 2007; Jennings & Bearak, 2014; Polikoff, 2012). Another policy effect, receiving limited attention until recently, has centered on educational system (re)building: that is, school districts' efforts at reorganizing around their core educational function, instruction (Austin et al., 2006; Cohen et al., 2013; Hopkins et al., 2013; Johnson et al., 2014; Marsh et al., 2005; Weast, 2014). ...
Article
This article examines how leaders in public, private, and hybrid educational systems manage competing pressures in their institutional environments. Across all systems, leaders responded to system-specific puzzles by (re)building systemwide educational infrastructures to support instructional coherence and framed these efforts as rooted in concerns about pragmatic organizational legitimacy. These efforts surfaced several challenges related to educational equity; leaders framed their responses to these challenges as tied to both pragmatic and moral organizational legitimacy. To address these challenges, leaders turned to an array of disparate government and nongovernment organizations in their institutional environments to procure and coordinate essential resources. Thus, the press for instructional coherence reinforced their reliance on an incoherent institutional environment.
... For example, standardization of the content and assessment across schools was intended to ensure that all children were held to equally high standards. Sanctions associated with poor performance, however, motivated school staff to "hide" low-performing students in nontested categorizations or through absence (Booher-Jennings, 2005; Chakrabarti & Schwartz, 2013; Figlio, 2006; Price, 2010), to focus primarily on "bubble" students likely to meet the standards with minimal additional support (Booher-Jennings, 2005; Jennings & Sohn, 2014), and to modify instruction to a narrow set of tested standards (Dee et al., 2013; Jennings & Bearak, 2014; Jennings & Sohn, 2014; Judson, 2013). At the extreme, several districts and schools were found guilty of facilitating cheating on assessments (Jacob & Levitt, 2003; Koretz, 1996). ...
Article
The continuous improvement (CI) approach to systems change has rapidly spread across education policy circles in recent years and has been hailed as a promising means to achieve educational equity and social justice. CI's highly routinized, scientific process for improving efficiency and productivity is a somewhat unexpected means to pursue equity. To understand this puzzle, I examine the use of CI to promote equity through two qualitative, multilevel case studies. I draw on institutional theory to understand how CI has integrated logics of racial equity and performance, and how local actors have improvised novel approaches. This analysis illuminates the complex institutional dynamics at play with CI implementation and identifies the challenges and promise of using CI to promote educational equity.
... Scholars raise questions, however, about the extent to which these policies have advanced students' educational experiences and outcomes (Cohen and Mehta, 2017;Cohen et al., 2007;Jennings, 2012;Mehta, 2013). Some districts and schools, for instance, responded to these policies by narrowing instructional practice and teaching to the test (Cuban, 2013;Desimone et al., 2007;Jennings and Bearak, 2014;Smith and Kovacs, 2011), while other districts engaged ceremoniously (but not substantively) in reform practices (Yurkofsky, 2017). These policies have also been associated with the disempowerment and stigmatization of minority groups (Lane et al., 2019;Trujillo and Renée, 2012). ...
Article
Historically, teachers had been delegated the primary responsibility for the organization and management of classroom instruction in US public schools. While this delegation afforded teachers professional autonomy in their work, it has also resulted in disparities in students’ educational experiences and outcomes within and between classrooms, schools, and systems. In the effort to improve instruction and reduce disparities for students on a large scale, one reform effort in the US has focused on building instructionally focused education systems (IFESs) where central office and school leaders collaborate with teachers to organize and manage instruction. These efforts are playing out in a variety of contexts in the US, including in public school districts, non-profits, and other educational networks, and it is shifting how teachers carry out the day-to-day work of instruction. In this comparative case study, we investigate two IFESs in which efforts to improve instruction pushed against historic norms of teacher autonomy. We found that these new systems are not at odds with teacher autonomy, but rather these systems reflect a transition to more interdependent notions of teacher autonomy.
... Kippers, Wolterinck, Schildkamp, Poortman, and Visscher (2018) found that although secondary school teachers in the Netherlands use various kinds of classroom assessments for information on student achievement, the systematic analysis of these data to support teaching has not been integrated into their practice. There is also some evidence to suggest that teachers engage in inappropriate test preparation practices in response to outcome data (e.g., teaching to the test) (Jennings & Bearak, 2014). Additionally, Mausethagen, Prøitz, and Skedsmo's (2018) work demonstrated that data use by Norwegian secondary school teachers usually aims to address short-term teaching goals and improve test results. ...
... If students become more comfortable with the exam situation, they will be better able to demonstrate that they have met the learning objectives than when the results are affected by their nervousness (Jennings & Bearak 2014, p. 382). Jennings & Bearak (2014) therefore argue that the consequences of teaching to the test should be evaluated as a spectrum with several possibilities rather than through the dichotomy most often applied: either teachers teach to the test or they do not; and either it is good or it is bad. ...
Article
With the aim of reducing the share of low-performing students in the school-leaving examinations of the Danish public school (folkeskolen), the Danish Parliament (Folketinget) adopted the so-called school pool (skolepuljen) in 2016. The purpose was to give selected schools a financial incentive to reduce the share of low-performing students. This article focuses on how school leaders perceived the school pool and the initiatives taken as a result. We focus in particular on whether negative washback effects occurred, i.e., undesired effects on teaching or students. The article is based on semi-structured interviews conducted with school leaders. The analysis uses sensemaking as a theoretical lens to examine how leaders create new meanings when new initiatives are launched as a consequence of the school pool. The results indicate that the pool's results-based funding affected teaching practice in the same way as the washback effects of standardized student tests. However, the school pool and its effects were generally experienced as positive. The study also shows that school leaders play an important role in how schools respond to results-based funding. Keywords: folkeskolen, results-based management, performance budgeting, sensemaking, skolepuljen, teaching to the test, washback effects, teaching, financial management.
... Examples of both types of behaviour have been reported in various studies about test-based accountability and school inspections in the US and Europe. In the US, Hamilton, Stecher and Klein (2002), Koretz (2002, 2003), Stecher et al. (2006), Holcombe, Jennings and Koretz (2012), Jennings and Bearak (2014), and others, for example, report score inflation when teachers teach to the test, schools narrow their curricula or reallocate their resources to tested subjects. In Europe, De Wolf and Janssens (2007), Ehren (2016) and Jones et al. (2017) talk about 'teaching to inspection', 'window dressing', 'ossification' and 'tunnel vision' as ways in which inspections and inspection frameworks unintentionally shape a school's organization and classroom teaching. ...
Chapter
This chapter will first explain what we mean by ‘educational accountability’ and how various types of accountability systems have standardized aspects of a school’s organization. Campbell’s law and the work of others (particularly Espeland and Sauder) are used to explain the mechanisms of self-fulfilling prophecy and commensuration, through which accountability measures potentially distort our understanding of school organization. The conclusion of this chapter draws on Creemers and Kyriakides’ (2015) dynamic model of educational effectiveness to inform a more holistic approach in holding schools to account, which allows for more variation in how schools organize their teaching and learning.
... While some scholars found that cross-international policy differences shape the types of problems educators focus on during data analyses (Schildkamp et al., 2017), others focused on the ways in which test-based accountability policies in the United States affect responses to data. Some documented "distortive" responses to data, such as adopting practices to raise test scores (e.g., focusing on "bubble kids") rather than genuine learning (Booher-Jennings, 2005;Jennings & Bearak, 2014;Hamilton et al., 2007). Others found that such policies can reinforce biases in data use processes rather than motivate deeper reflection on improving practice (e.g., Brette et al., 2017). ...
Article
Background/Context Researchers have amassed considerable evidence on the use of student performance data (e.g., benchmark and standardized state tests) to inform educational improvement, but few have examined the use of nonacademic indicators (e.g., indicators of social and emotional well-being) available to educators, and whether the factors shaping academic data use remain true for these newer types of data. While the field continues to advocate for greater attention to the social–emotional development of students, there remains little guidance on conditions supporting the use of data on these important mindsets, dispositions, beliefs, and behaviors. Purpose/Focus of the Study In this article, we use sensemaking theory, prior research on academic data use, and research from a study of “early adopter” California districts to develop a framework for understanding conditions likely to shape educators’ use of social–emotional learning (SEL) indicators to inform practice. Research Design We develop our findings and framework by drawing on prior research and theory, as well as data from a multiyear research–practice partnership with a consortium of California districts that began measuring SEL as part of the No Child Left Behind waiver they received from the U.S. Department of Education. We draw on more than 125 interviews with consortium leaders, central office administrators, leaders, teachers, and staff in 25 schools and six districts to understand how they made sense of SEL and SEL survey data, as well as the practices employed to support SEL. Findings We find that five categories of conditions appear to shape how educators interpret and respond to SEL indicators: policy context, organizational conditions, interpersonal relationships and interactions, data user characteristics, and data properties. 
Much like academic data use, we find: (1) the accountability policy context can convey a sense of importance, but may also lead to distortive responses; (2) district and school leaders are critical for allocating time and staff, and cultivating a data culture; (3) collaboration facilitates sensemaking; (4) individual-level knowledge and beliefs can shape interpretation; and (5) timeliness and perceived relevance of data matter. Some of these conditions, however, are uniquely relevant to the use of SEL data, which brings greater ambiguity, uncertainty, and a decoupling from the traditional academic role of educators. We find that including SEL indicators in multiple measure systems can lead to uncertainty and interpretive complexity, and divide educators’ attention. Deficit conceptions may also shape sensemaking and are especially germane in the SEL context given documented gaps by race/ethnicity on measures of SEL. Another condition especially relevant to SEL indicator usage is the lack of coherence or clarity around SEL. The frequent misunderstandings of and disagreement about SEL—sometimes shaped by disciplinary background—could lead to different interpretations and responses. All of these conditions suggest that sensemaking and response to SEL data indicators are complex processes that require multiple enabling factors. Conclusions and Implications Given the significant investments in supporting and measuring student social-emotional development, it behooves policymakers, education leaders and practitioners to better understand the conditions facilitating and inhibiting productive use of SEL indicators. The framework provided herein presents a set of concepts and conditions that may be useful in supporting this process. The findings also raise a cautionary flag that while sometimes consistent with the process of using academic data, the use of SEL indicators may present added challenges worthy of attention. 
We conclude with implications for policy, practice, and research. Notably, education leaders and practitioners may want to invest in building common understanding of SEL and capacity to interpret and act on these indicators, and consider how equity orientations shape understanding and usage of SEL indicators. Policymakers may want to consider more formative uses of SEL data that are provided to educators earlier in the year, and attend to the human capital needs that accompany SEL data usage. Finally, researchers might build on this work by further examining the relationship between SEL and culture/climate and the ways in which educators respond to data on both, and also investigate the outcomes of SEL data usage, such as actions that lead to meaningful improvements in SEL.
... Nonetheless, detrimental consequences have also been reported as a result of SBR. Increased teaching-to-the-test (Amrein & Berliner, 2002;Jennings & Bearak, 2014) and a narrowing of the curriculum (Berliner, 2011;Christenson, Decker, Triezenberg, Ysseldyke, & Reschly, 2007) are significant negative consequences of SBR for all students. Scholars have also documented increased dropout rates for students with disabilities (Cole, 2006;Lillard & DeCicca, 2001) and lower graduation rates, particularly in states that link successful passage of exit exams to diploma requirements (Gaumer-Erickson, Kleinhammer-Tramill, & Thurlow, 2007). ...
Article
The continuously evolving standards-based reform (SBR) movement is one of the most prominent features of today's educational policy landscape. As SBR has continued to drive educational policy, local schools and districts have adopted many approaches to comply with legal mandates. This article critically examines one particular resultant phenomenon of the SBR movement—the emergence of a new track of self-contained classes called Prioritized Curriculum classes, designed to provide students with disabilities access to standards-based general education curriculum, but in a segregated class. In this article we document the emergence of such courses and critically analyze the rationales and policy loopholes that have led to their creation.
... The MET study in and of itself was designed to observe and identify "measures of effective teaching" to promote standardization of instructional approaches that yield higher student academic achievement outcomes (White & Rowan, 2018). Arguably, the dominance of the Task-Focused classroom reflects the teaching to the test learning climate (Jennings & Bearak, 2014) that had taken hold in the United States by 2009 (the first year of the MET data study) since the passage of No Child Left Behind in 2001 (U.S. Congress, 2002). ...
Article
Inspired by a “whole child” framing, the current study takes a “whole classroom” perspective to consider classroom practice. Study aims included: (1) presenting a systematic video-based observational coding strategy to concurrently consider practice domains that have implications for learning—cognitive instruction, classroom management, and teacher–student relational interactions; (2) identifying distinct and interrelated classroom typologies based upon this coding strategy. The framework was developed through coding and analysis of 58 purposively sampled urban 4th–9th grade classrooms from the Measures of Effective Teaching study. Analyses revealed three overarching typologies: task-focused (52%), low stimulation (43%), and optimal (5%). We conclude by discussing implications for urban education.
... Kippers, Wolterinck, Schildkamp, Poortman, and Visscher (2018) found that although teachers in the Netherlands use various kinds of classroom assessments for information on student achievement, the systematic analysis of these data to support teaching has not been integrated into their practice. There is also some evidence to suggest that teachers engage in inappropriate test preparation practices in response to outcome data (e.g., teaching to the test) (Amoako, 2019; Jennings & Bearak, 2014). Further, based on their interviews, Gelderblom et al. (2016) concluded that although teachers claim to be aware of the importance of data use and consider themselves to use data to a great extent, their use of data for instructional purposes often goes amiss. ...
Article
Assessment results can be a guide to instruction, and they can ensure that the prescribed curriculum is well covered. When assessment data are used as a means of making appropriate instructional adjustments for improvement, teaching and learning progress. The study examined basic school teachers’ perception and use of assessment data. A cross-sectional survey design was used for the conduct of the study. One hundred and fifty (150) teachers within the Central Region of Ghana were sampled from twenty (20) basic schools using a systematic sampling procedure. A two-dimensional questionnaire was adapted, validated and used for the collection of research data. The data to provide answers to the study question were analysed using descriptive statistics, specifically, percentages and frequencies. The hypothesis was tested using a Partial Least Squares structural equation modelling approach. Findings revealed that in practice, basic school teachers use assessment data to plan instruction, evaluate students’ learning progress, determine curriculum strands to emphasize during teaching sessions and also to evaluate instructional effectiveness for the academic year. The study further showed that teacher perception about assessment significantly predicts assessment data use. The study recommends that tertiary institutions that train teachers continue to place much emphasis on the teaching of ‘assessment in schools’ to deepen prospective teachers’ knowledge and utilization of assessment data for sustenance of positive ‘assessment data use practices’ in Ghana basic schools.
... Fourth, teachers' ability to act on their sense of responsibility is supported in initiatives where they can play a meaningful role in assessing and monitoring change in relation to their own efforts. For example, in contrast to more distant, managerial approaches to monitoring outcomes (e.g., using large-scale assessments; Jennings & Bearak, 2014; Knoester & Au, 2017; Koretz, 2017; Von der Embse et al., 2016), when data are generated with/by teachers in ways that provide rich, situated information, then teachers can use that information to make decisions about how to adjust their teaching and then monitor student learning outcomes (Butler & Schnellert, 2020; Lauermann & Karabenick, 2011). In these conditions, teachers build from and strengthen their self-efficacy, when they can see meaningful and valued changes associated with their efforts, which also then fuels further investment in change initiatives (Butler & Schnellert, 2012, 2020; Butler et al., 2015). ...
Article
This research examined how stakeholders (n = 40) from one school district experienced “accountability” within a context where responsibility for student learning was being distributed across the system. Using a case study design, we examined: what conditions supported stakeholders in multiple roles to exercise responsibility for student learning? Analyses of documents and interviews revealed conditions that enabled teachers, instructional leaders, and administrators to share responsibility in relation to their roles, and empowered teachers to engage in inquiry for continuous improvement and build from their sense of professionalism and responsibility. Implications are discussed for empowering teachers, and other stakeholders, to exercise responsibility in the context of an accountability system.
... These results provide slightly suggestive evidence of Republican emphasis on mathematics over English language arts, mirroring a national trend where conservatives de-emphasize liberal arts in favor of science and math (Cohen, 2016). The experience finding may also speak to the importance of teacher experience in ELA, for which preparing students for state tests of reading comprehension may require more experience and content knowledge (Jennings & Bearak, 2014). Though this When looking at historically disadvantaged students, these relationships change somewhat in Tables 12-13. ...
Article
Education reform rhetoric frequently pits the vested interests of teachers’ unions against those of students and families. To test whether union restrictions are related to student learning, I analyze a unique database of contractual items for the 2016-2017 school year across all 499 Pennsylvania school districts in order to examine a) variation, b) partisan political predictors, and c) relationships to student achievement and graduation rates. I also examine changes in 105 contracts that occurred during the 2015-2016 school year. I depict variation among items using GIS mapping. I use OLS regression, probit regression, and spatial autoregression to examine relationships between contract features and student proficiency and graduation rates. I also use propensity score weighting with generalized boosted models (GBM). After controlling for spatial dependence and district demographics, I find a significant negative relationship between the percentage of registered Republicans in a district and bonuses for teacher graduate credentials. I find a significant and positive relationship between Republican registered voters and math and science proficiency. This relationship diminishes in magnitude for ELA proficiency. I also find a significant positive relationship between average years of teaching experience and ELA proficiency in grades 3-8. Using GBM, I find significant positive estimates (+2%) of teacher qualification indicators on students’ math achievement in grades 3-8, and a significant positive estimate (+2%) between harsh consequences for ELA teachers and student proficiency. I also find a significant positive estimate between higher teacher pay and biology proficiency (+4% for historically disadvantaged students), as well as a significant negative estimate of graduate credential bonuses on graduation rates (-6%). 
These correlational results suggest that subject-area and grade-level differentiation in contracts – such as higher wages for STEM teachers – might be beneficial. The most effective STEM teachers might be seeking out positions in the best-paying districts with the strongest contracts.
... Other schools' test score gains may owe more to behaviors that have less to do with sustainable learning. Jennings and Bearak (2014) find that most of the score gains in several large states come from the most common question types, implying that such questions are particularly emphasized in the test preparation as well as in the testing. Jacob (2005) examines a new accountability policy in the Chicago public school system and finds that schools that raise high-stakes exam scores often do not raise low-stakes exam scores, as the schools focus heavily on test preparation, retention of underperforming students, and careful selection of the set of students to be tested. ...
Thesis
In this dissertation, I apply administrative data from Michigan's public schools to address crucial policy questions in the economics of education. In Chapter I, I shed light on the persistent effects of attending high-quality high schools by creating a value-added model that isolates each high school's effect on students' test scores, then matching the results to students' college transcripts to determine the relationship between high-school value added and first-year college grades. I find that students who attend high schools with one standard deviation higher value added receive first-year grades about 0.09 grade points higher than their otherwise-identical counterparts. These gains in college are not driven solely by math and English, but are evenly distributed across subjects. This result is robust to adjustments for a number of potential biases that arise throughout the process, including selection into high schools and selection into college attendance. Overall, I find evidence against some of the more skeptical interpretations of test-score improvement, such as the claim that schools "teach to the test" or the concern that the content tested on standardized exams is not relevant to future learning. Human capital theory suggests that when students would graduate into a weak labor market, the opportunity cost of schooling declines, and they should instead invest in themselves and get more education. However, this assumes that they have no borrowing constraints; if students are credit-constrained and their families are hurt by the struggling labor market, then their educational options may actually diminish as they are less able to pay for college. 
In Chapter II, I determine which of these effects predominates empirically using data on plant closings and mass layoffs from the Worker Adjustment and Retraining Notification ("WARN") Act, examining the impact of exposure to job losses during the senior year of high school on whether and where students attend college. A 1-standard deviation increase in per-capita job losses is associated with a small but statistically-significant 0.2-percentage point increase in the probability of attending college, driven entirely by attendance at community colleges. This result supports the argument that the opportunity cost effect dominates, as any movement out of college as a result of credit constraints and firsthand exposure to job losses is comparatively small. Having access to an effective and experienced teacher can make a crucial difference in a student's academic achievement. In Chapter III, written with Kolby Gadd, I examine the factors that predict whether teachers will stay in their first jobs or leave for opportunities elsewhere, and then study how students perform after teachers leave, looking both at teacher turnover in general and at teacher departures for particular destinations such as new districts or the private sector. In a multinomial logit framework in which we examine each teacher's employment status in his or her fifth year, we find that the characteristics that predict departures most consistently are the fraction of Black students in the teacher's first school, the fraction of economically-disadvantaged students in the teacher's first school, the teacher's first job being in special education, and the teacher's first school being a charter. Turning to how student achievement changes in the wake of teacher turnover, we find a modest decline in test scores after teacher exits, driven by students in schools that lost teachers to other full-time teaching positions, both within and across districts.
... Kippers, Wolterinck, Schildkamp, Poortman, and Visscher (2018) found that although secondary school teachers in the Netherlands use various kinds of classroom assessments for information on student achievement, the systematic analysis of these data to support teaching has not been integrated into their practice. There is also some evidence to suggest that teachers engage in inappropriate test preparation practices in response to outcome data (e.g., teaching to the test) (Jennings & Bearak, 2014). Additionally, Mausethagen, Prøitz, and Skedsmo's (2018) work demonstrated that data use by Norwegian secondary school teachers usually aims to address short-term teaching goals and improve test results. ...
Article
Evidence suggests that the quality of teachers’ instructional practices can be improved when these are informed by relevant assessment data. Drawing on a sample of 1,300 primary school teachers in Ireland, this study examined the extent to which teachers use standardized test results for instructional purposes as well as the role of several factors in predicting this use. Specifically, the study analyzed data from a cross-sectional survey that gathered information about teachers’ use of, experiences with, and attitudes toward assessment data from standardized tests. After taking other teacher and school characteristics into consideration, the analysis revealed that teachers with more positive attitudes toward standardized tests and those who were often engaged in some form of professional development on standardized testing tended to use assessment data to inform their teaching more frequently. Based on the findings, policy and practice implications are discussed.
... Under the "No Child Left Behind" (NCLB) program, important state standards were excluded from instruction as teachers altered their practices to focus on predictably tested skills. The choices made regarding measurement directed instruction away from elements of the intended curriculum (Jennings & Bearak, 2014). This phenomenon is now broadly referred to as teaching to the test, a practice that reduces the depth of instruction and narrows the curriculum (Volante, 2004). ...
Article
Data-driven decision making as an extension of test-based accountability policies for educational reform and improvement promises new insights into efficient and effective leadership. An examination of the context surrounding the implementation of this decision making model, particularly relationships of power that serve to enframe the discourse surrounding education, reveals fundamental problems with the implementation of data-driven decision making models. This paper contends that under current contexts the practice at best constitutes a form of illiteracy, and at worst may undermine the public and democratic purposes of education. It is concluded therefore that what is needed in education is not data-driven decision making, but rather principled leadership and a moral framework for the use of information by educators. This leadership should be informed by the application of a logic model for program evaluation, and a democratic discourse led by educators.
... Teacher preparation methods have received increased focus in the current era of increased demands on teachers to support K-12 (i.e. kindergarten to Year 12) student achievement (Jennings & Bearak, 2014). Researchers (e.g., Chazan et al., 2018;Grossman et al., 2009;Trevisan et al., 2019) recommend that teacher education programmes should incorporate practice-based experiences that approximate teaching, meaning provide experiences that are similar to actual teaching practice. ...
Article
The use of technology affords opportunities for prospective teachers to engage in actions that are proximal to the work of teaching. The authors designed a task in which prospective teachers (n = 95) at four institutions created an animation or depiction of a classroom scenario using one of two technology platforms: GoAnimate or LessonSketch. They used a convergent mixed-methods design in which qualitative findings were quantitised and then examined statistically to determine what technological aspects prospective teachers used to create their approximations of practice, as well as how they perceived platform use. Findings indicate that the perception of being fun, having a learning curve and being difficult to use statistically affected evaluation. Two technological aspects also had a significant effect on perception: mathematical representation and altered visual field. Findings imply that prospective teachers’ experiences with technology use, coupled with how they use the tools, impact their appraisal of the platform.
... Amidst increasing pressure and limited instructional hours, teachers may resort to "teaching to the test" in an effort to demonstrate learning gains on these standardized assessments [3,5,6]. Unfortunately, this practice leads to inaccurate inferences about the knowledge and skills that students have acquired and unreliable inflation in scores on state-level standardized assessments that are not achieved on the NAEP [5,7], nor do these types of tests reflect the types of literacy tasks that the student will encounter outside of the testing room [8,9]. The result is that students fail to develop the comprehension strategies that will better serve them outside of a standardized test [10]. ...
Article
Full-text available
Literacy skills are critical for future success, yet over 60% of high school seniors lack proficient reading skills according to standardized tests. The focus on high stakes, standardized test performance may lead educators to “teach-to-the-test” rather than supporting transferable comprehension strategies that students need. StairStepper can fill this gap by blending necessary test prep and reading comprehension strategy practice in a fun, game-based environment. StairStepper is an adaptive literacy skill training game within Interactive Strategy Training for Active Reading and Thinking (iSTART) intelligent tutoring system. StairStepper is unique in that it models text passages and multiple-choice questions of high-stakes assessments, iteratively supporting skill acquisition through self-explanation prompts and scaffolded, adaptive feedback based on performance and self-explanations. This paper describes an experimental study employing a delayed-treatment control design to evaluate users’ perceptions of the StairStepper game and its influence on reading comprehension scores. Results indicate that participants enjoyed the visual aspects of the game environment, wanted to perform well, and considered the game feedback helpful. Reading comprehension scores of students in the treatment condition did not increase. However, the comprehension scores of the control group decreased. Collectively, these results indicate that the StairStepper game may fill the intended gap in instruction by providing enjoyable practice of essential reading comprehension skills and test preparation, potentially increasing students’ practice persistence while decreasing teacher workload.
... Under this rationale, school practice that is "tied" to the forms and content demanded by passing external assessments becomes a kind of "drilling" that takes time away from other activities, content, interactions, and learning that are more meaningful for students (Heredia Latorre and Kline, 2018; Jennings and Bearak, 2014). The standardization of thought and behavior among students and teachers negatively affects the development of diverse methods and activities, as well as the use of varied resources and materials, in pedagogical proposals. ...
Book
Full-text available
This book is the fully revised edition of the book Evaluaciones Externas Mecanismos para la configuración de representaciones y prácticas en educación
Chapter
This chapter employs several Marxist concepts to explore the role of high-stakes, standardized tests within capitalist-oriented reforms and policies in the United States. Starting with a materialist history of the rise of standardized testing within the context of capitalist schooling, this chapter then moves on to analyze the role of testing in the quantification of students, teaching, and learning for use within systems of educational policy built around quasi-free markets. This chapter then argues that, instead of measuring teaching and learning, modern-day high-stakes standardized tests actually measure what Marx referred to as socially necessary labor time. The conclusion of this chapter briefly discusses resistance to high-stakes, standardized testing and offers some suggestions for alternative assessments.
Book
Full-text available
This work is the result of a process of research and reflection in the field of educational sciences. It presents experiences that emerge from a variety of research approaches, including action research, the systematization of experiences, autobiographical research, case studies, and autoethnography. Undoubtedly, the knowledge generated through this work provides valuable elements for a deeper understanding of the complexities and challenges faced by those who train future education professionals.
Article
Propensity score analyses (PSA) of continuous treatments often operationalize the treatment as a multi-indicator composite, and its composite reliability is unreported. Latent variables or factor scores accounting for this unreliability are seldom used as alternatives to composites. This study examines the effects of the unreliability of indicators of a latent treatment in PSA using the generalized propensity score (GPS). A Monte Carlo simulation study was conducted varying composite reliability, continuous treatment representation, variability of factor loadings, sample size, and number of treatment indicators to assess whether Average Treatment Effect (ATE) estimates differed in their relative bias, Root Mean Squared Error, and coverage rates. Results indicate that low composite reliability leads to underestimation of the ATE of latent continuous treatments, while the number of treatment indicators and variability of factor loadings show little effect on ATE estimates, after controlling for overall composite reliability. The results also show that, in correctly specified GPS models, the effects of low composite reliability can be somewhat ameliorated by using factor scores that were estimated including covariates. An illustrative example is provided using survey data to estimate the effect of teacher adoption of a workbook related to a virtual learning environment in the classroom.
Article
An increased emphasis on writing standards has led many U.S. states to incorporate on-demand writing assessments into their test-based accountability system. We argue this creates political and pedagogical tensions for teachers to navigate. We discuss how rubric conceptualization (1) is a process wherein a teacher iteratively (co-)constructs meaning from a rubric’s design via classroom instruction; (2) is informed by implicit theories of learning; and (3) often requires a teacher to negotiate the competing pedagogical and political meanings of a rubric. While test-based accountability frameworks promote rubric use that equates learning with student achievement, rubric conceptualization is a process where teachers have some agency to resist behaviorist approaches to instruction.
Article
Full-text available
Decades after the 1972 United Nations Conference on the Human Environment held in Stockholm declared environmental education (EE) an essential tool to mitigate environmental challenges, the implementation of EE still faces many obstacles. Accordingly, innovative and solution-oriented approaches remain vital to enable environment-driven pedagogy in formal and non-formal education settings. This paper, which is located within the context of a case study that was conducted with the aim to investigate the application of distributed leadership in the teaching of EE in South Africa, reports on hierarchical power relations as impediments to curriculum transformation and implementation and, by extension, a hindrance to the infusion of EE in pedagogy. The results of this study suggest that hierarchical power relations in the schooling system hamper the involvement and participation of various stakeholders in key decision-making responsibilities, particularly curriculum management. Accordingly, processes such as curriculum modification, which are essential to enable the implementation of EE, are impeded. The researchers of the current study argue that, based on its marked successes in various spaces, especially in the realm of education, distributed leadership could be one of the viable agencies to enable EE implementation.
Chapter
This chapter will explore a prospective alignment of out-of-school time (OST) and in-school SEL programming to afford students of color the opportunity to develop holistically within in-school and out-of-school settings where they may be disproportionally challenged by conditions of poverty, racial/ethnic stereotyping, history of failure, educator turnover, and a culture where there has been significant pessimism about their ability to be successful in an educational environment of rigorous standards and continuous standardized assessments. A research-based dual capacity building framework will be introduced that will allow for culturally-responsive SEL efforts to be coordinated and linked to in-school academics and in OST programming. The framework will allow for culturally responsive social emotional learning development to be learned, supported, and valued in dual contexts.
Chapter
Human beings vary in their endowments or talents. Diverse genetic and environmental factors give rise to unique strengths, interests, and preferences that are responsible for a broad spectrum of human variability. Consequently, humans have “jagged profiles” of abilities, skills, attitudes, social-emotional qualities, and even physical attributes (Rose, The end of average: How we succeed in a world that values sameness (1st ed.). HarperOne, 2016). Although jagged profiles are immensely useful for fostering ingenuity and technological achievement, not all talents, attitudes, and qualities are equally valued in a society. Some talents are considered more desirable and thus worthy of investment, while others are considered undesirable, to be suppressed, corrected, or normalized. Schools play a significant role in developing the desirable and correcting the undesirable according to socially accepted definitions of desirable and useful. When students differ from what is considered “normal”, they are placed in systems to support the desirable. Both gifted and special education programs represent polar ends of the education system. Gifted education is intent on accelerating or expanding, whereas special education is intent on remediating, correcting, and replacing deficient behaviors and skills. However, society has changed. What was previously deemed desirable may actually be less desirable or obsolete and what used to be undervalued has gained more value. In this article, we discuss the need to change education to focus on individual talent, strengths, interests, and preferences. We provide examples of the changes and discuss how education should treat all education the same and promote a talent-based transformational gifted education for all students.
Keywords: Learner diversity; Talent development; Transformative giftedness; Special education; Gifted and talented education
Article
Promoting English as a foreign language (EFL) students’ reading comprehension, improving their reading motivation, and reducing their anxiety have always been the focus of EFL researchers. Educational board games have gradually received attention from educators. Various studies have found that board game-based teaching can improve students’ learning performance, motivation, and classroom engagement, and contribute to increasing discussion. However, there are few discussions on board games in the field of EFL, so further discussion is necessary to fill the research gap of board games in EFL reading comprehension. This study developed a dual-hierarchical scaffolding board game activity framework integrating game-based learning (GBL) and scaffolding strategies, and used card board games as teaching materials to promote EFL students’ reading comprehension. The game activity framework provides a scaffolded language learning environment with dual-hierarchy concepts to support students in building successful reading strategies. Moreover, the verbal and visual representations are displayed together on the cards game component. Game mechanics can help students to link verbal and image information. On the other hand, this framework emphasizes allowing students to first use the group discussion and collaborative reading mode to conduct self-directed and game-based learning activities with the dual-hierarchy scaffolding strategy. Then the teacher guides the students to use the concepts learned in the game activities to analyse and read the text so that they can understand the text content. A quasi-experimental design was conducted in a high school English reading course to evaluate the effectiveness of the proposed framework. The experimental results showed that the proposed framework not only significantly promoted the EFL students’ reading comprehension and learning motivation and decreased their learning anxiety, but also helped them maintain a high level of learning motivation in the process of the activities. It is expected that the proposed structure can lead to better EFL teaching practices.
Article
We examine how incentives for test prep varied between math and English language arts (ELA) on U.S. state accountability exams. We collected data on exam structure for grade 3 to 8 tests in six states that are the setting for most U.S. research in literatures where accountability matters. We show that math exams typically measured ability more precisely for students on the margin of achieving proficiency. This gave educators an incentive to spend more time preparing students for math tests than for ELA tests, consistent with the common finding of larger math effects in the literature.
Chapter
The school leader can develop trust through collegial interaction, learning-centered leadership, and democratic decision making. Trust across the school community is “contagious.” For example, when teachers trust the school leader, they are more likely to trust each other, students, and parents/guardians, and parents/guardians are more likely to trust the school. Teachers are more likely to trust the school curriculum if they assess and adapt it, and more likely to build trust in each other through collaborative curriculum development. A social-constructivist curriculum is more likely to increase trust between teachers and students and among students. Most high-stakes achievement tests are untrustworthy for several reasons: they measure lower-level cognitive skills, do not provide a balanced assessment across designated content areas, lead to teaching to the test and a focus on breadth rather than depth, and may not accurately reflect student learning. Student self-assessment, teacher observation, and portfolio defense are a few of the more trustworthy ways of assessing student learning. Key features of school-family trust are inclusion of, commitment to, communication with, and collaboration with families. The key to school-community trust is engagement with the community based on in-depth knowledge of the community. Three models for engagement and trust building are the school-community council, the community leadership team, and community-school partnerships.
Article
The equity-efficiency trade-off and cumulative return theories predict larger returns to school spending in areas with higher previous investment in children. Equity, not efficiency, is therefore used to justify progressive school funding: spending more in communities with fewer financial resources. Yet it remains unclear how returns to school spending vary across areas by previous investment. Using county-level panel data for 2009-18 from the Stanford Education Data Archive, the Census Finance Survey, and National Vital Statistics, the authors estimate achievement returns to school spending and test whether returns vary between counties with low and high levels of initial human capital (measured as birth weight), child poverty, and previous spending. Spending returns are higher among counties with low previous investment (counties that also have a high percentage of Black students). Evidence of diminishing returns by previous investment documents another way that schools increase equality and establishes another argument for progressive school funding: efficiency.
Article
Access to education for all students has been long sought. Once defined as physical access to a school building, the concept of access has evolved since Brown v. Board of Education. The purpose of this policy review, conducted through archival research, is to examine the evolution of access to K–12 education for all students, with an emphasis on students with disabilities who are general education students first, to understand the implications of the 2020 U.S. 6th Circuit Court of Appeals Gary B. vs. Whitmer decision. For the first time, a federal court ruled that the constitution affords all students “a fundamental right to a basic minimum education.” Specifically, the evolving concept of access to education for all students including students with disabilities across (a) the school building, (b) curricular opportunities, (c) education outcomes, and (d) a college- and career-ready curriculum is outlined using landmark K–12 federal education legislation, court cases, and policy initiatives. Taken together, a basic minimum education includes access to challenging academic learning objectives, an emphasis on literacy, provision of educational materials of sufficient quantity and quality, and an adequate teacher workforce. Meeting these expectations assures genuine access to a public education for all students.
Conference Paper
Full-text available
In European countries like France and Germany, civic and citizenship education are widely and controversially discussed in the public and in social science research. A range of educational policies are proposed and implemented which try to exploit citizenship education for the betterment of societal integration and cohesion and for the appeasement of political and socio-economic conflicts. Schools in general and citizenship education in particular are under pressure of high societal and political expectations. Although education and schools in general have been classical topics of the économie des conventions for decades, the more specific field of citizenship education has been largely ignored until recently. Against this backdrop, the paper gives a brief outline of a genuine conventionalist approach for analysing citizenship education in schools in a transnational perspective. The key elements of this approach are situations as the key units of analysis, the plurality of situations and of persons who refer to plural orders in these situations, the controversial character of situations and the striving for establishing compromises with regard to certain orders of worth. The concept of interlinked situations will be introduced as an alternative to the customary multi-level analysis. The paper argues that methodological situationalism allows overcoming the systematic shortcomings of international comparison. It is a conceptual paper based on common research with Andrea Szukala and Claude Proeschel (Hedtke, Proeschel and Szukala 2017, Hedtke, Szukala and Proeschel 2019). The key questions are: (1) What characterises a comparative conventionalist approach? (2) What can be better observed and analysed with the help of a conventionalist-situationalist approach? (3) Is a conventionalist approach appropriate with regard to the always conflict-laden area of citizenship education? (4) How should seemingly similar situations be compared which are embedded in different institutional contexts across national borders?
Thesis
Increasingly, educational networks are regarded as unique organizational arrangements capable of supporting large-scale instructional improvement. While a portion of the existing research on educational networks takes on matters of efficacy to improve outcomes, there is more limited research focused on understanding the core work of running educational networks. As activity around educational networks proliferates to include many types of networks existing in and around schools, there is a need to establish analytic frameworks that help researchers to 1) understand and reason about the core work of educational networks and 2) compare across different network types. This study moves on this agenda by 1) developing an analytic framework for understanding educational networks and 2) empirically testing that framework using a cross-case analysis of two networks positioned in different market sectors. This study finds fundamental distinctions in the networks’ designs for instructional improvement: one leveraging a highly-specified, fidelity-based approach; the other leveraging a less-specified, adaptive approach. This study also finds that four key network dimensions--structure, governance, composition, and purpose, help to explain the networks’ designs for improvement. The key lesson of this study underscores the inherent complexity and interdependent nature of designing, managing, and studying educational networks.
Article
A central concern surrounding test-based accountability is that teachers may narrow teaching practices to improve test performance on a curriculum-based specific knowledge test rather than student learning more broadly. Two of the most common teaching practices that “teach to the test” are providing test-specific classwork and increasing the frequency with which students take practice tests. Whether such teaching practices improve student learning—both in terms of learning the content associated with a specific knowledge test as well as more general learning—is a largely unanswered question. To approach this question, this paper uses a student fixed effects approach to analyze the impact of these kinds of narrow teaching practices on student performance on a specific test as well as a general knowledge test. We find that test-specific classwork and practice tests with specific test items tend to have little or negative impacts on curriculum specific or general knowledge test performance, except for male students, and that subject practice tests (without emphasizing test-specific items) have positive effects on student outcomes on both kinds of tests, but larger on the curriculum-specific than on the general test, and much larger on the curriculum-specific test for male students. We discuss the logic for these results and what they tell us about the effectiveness of test-focused teaching practices more generally.
Article
Full-text available
At the global level, the policies of standardization and results-based accountability have expanded, and in recent years, the simultaneous demand for more inclusive schools has increased. Some of the central actors for attaining inclusion are special education teachers, yet little research has focused on them or on their work. The current study has attempted to understand how the policies of standardization and results-based accountability influence these teachers' work and attitudes. To reach this goal, a case study in Chile conducted interviews with ten special education teachers, followed by the selection of two of them to participate in ethnographic follow-up for five months (240 hours of observation). The findings reveal teacher identities caught between inclusion and standardization, along with evidence of teacher resistance to the mandates of accountability.
Article
Full-text available
School districts across the country have adopted interim and benchmark assessments in response to NCLB pressures to raise student achievement, despite the lack of a research base. Thus, it is especially important that well-conceived, empirical studies of the effects of such programs be carried out. Various theoretical frames can help understand how teams of teachers might begin to use data to reflect upon and adjust their instructional practices. At issue is whether an idealized data-based decision-making theory of action will play out in practice. Or, will teaching-the-test practices instead be exacerbated? What we learn from the empirical studies reviewed here is that positive examples, where assessment results were coherently linked to curriculum and instruction, were facilitated by highly committed principals and teacher leaders. But, these examples were rare. More frequently, interim assessments results appeared to be used, item by item, to reteach steps in problems that were missed without attending to underlying concepts or gaining diagnostic insights.
Article
What happens to science and the arts in a back-to-basics accountability program? The authors' survey of teachers in North Carolina provides the answer to this and other questions about the impact of that state's most recent reform plan.
Article
The Percentage of Proficient Students (PPS) has become a ubiquitous statistic under the No Child Left Behind Act. This focus on proficiency has statistical and substantive costs. The author demonstrates that the PPS metric offers only limited and unrepresentative depictions of large-scale test score trends, gaps, and gap trends. The limitations are unpredictable, dramatic, and difficult to correct in the absence of other data. Interpretation of these depictions generally leads to incorrect or incomplete inferences about distributional change. The author shows how the statistical shortcomings of these depictions extend to shortcomings of policy, from exclusively encouraging score gains near the proficiency cut score to shortsighted comparisons of state and national testing results. The author proposes alternatives for large-scale score reporting and argues that a distribution-wide perspective on results is required for any serious analysis of test score data, including “growth”-related results under the recent Growth Model Pilot Program.
Article
This paper focuses on ways in which one state policy for improving education--standard-setting through testing mechanisms--affects the classroom teacher-learner relationship. That uniform policy-making is problematic is clear from observations of 43 Mid-Atlantic school district teachers. Responding to three types of standards, 45 percent found minimum competency testing objectionable because a single measure cannot allow for student, resource, and goal differences. Likewise, standardized testing for decision-making about students was typically viewed as curriculum narrowing. The strongest reaction stemmed from competency-based approaches to teaching and learning that require test-passing for each discrete skill before moving on. Teachers generally found it difficult to adapt standard policies to the disparate needs of students, though many recognized the usefulness of a common educational direction. The need for dual accountability--to students and administration--is a problem that could be partially rectified through ensuring competency among teachers. Nonetheless, teachers familiar with the competency-based teacher certification idea recently advanced by policy-makers again tended to oppose it: like teaching itself, learning to teach is a complex activity requiring behaviors varying from student to student, an attitude research confirms. Paper and pencil competency tests were also viewed skeptically; 60 percent of teachers opposed tests for recertification. In sum, policymakers must realize the collective impact of such policies since they may make teaching less attractive and thus work against themselves. (KS)
Article
The principle that important decisions should not be based on a single measure is axiomatic, if widely ignored in practice. The traditional rationale is the risk of incorrect decisions from incomplete and error-prone data. The current high-stakes uses of test scores increase the need for multiple measures for two distinct reasons: the risk of score inflation and the potential for perverse incentives for educators and students. Addressing these two issues may require focusing accountability on measures of schooling as well as a much wider range of measures of student outcomes. The difficulties of pursuing this approach are described, and some possible directions for research and development are noted.
Article
What is the impact of an external testing program on what teachers teach and the way they teach it? How does an external testing program affect a school's organization? Are these effects contrary to the general goal of improving schools?
Article
Many policymakers feel pressure to claim that No Child Left Behind (NCLB) is boosting student performance, as Congress reconsiders the federal government’s role in school reform. But how should politicians and activists gauge NCLB’s effects? The authors offer evidence on three barometers of student performance, drawing from the National Assessment of Educational Progress (NAEP) and state data spanning the 1992–2006 period. Focusing on the performance of fourth graders, where gains have been strongest since the early 1970s, the authors find that earlier test score growth has largely faded since enactment of NCLB in 2002. Gains in math achievement have persisted in the post-NCLB period, albeit at a slower rate of growth. Performance in many states continues to apparently climb. But the bar defining proficiency is set much lower in most states, compared with the NAEP definition, and the disparity between state and federal results has grown since 2001. Progress seen in the 1990s in narrowing achievement gaps has largely disappeared in the post-NCLB era.
Article
The recent federal education bill, No Child Left Behind, requires states to test students in grades 3 to 8 each year and to judge school performance on the basis of these test scores. While intended to maximize student learning, there is little empirical evidence about the effectiveness of such policies. This study examines the impact of an accountability policy implemented in the Chicago Public Schools in 1996–1997. Using a panel of student-level, administrative data, I find that math and reading achievement increased sharply following the introduction of the accountability policy, in comparison to both prior achievement trends in the district and to changes experienced by other large, urban districts in the mid-west. However, for younger students, the policy did not increase performance on a state-administered, low-stakes exam. An item-level analysis suggests that the observed achievement gains were driven by increases in test-specific skills and student effort. I also find that teachers responded strategically to the incentives along a variety of dimensions—by increasing special education placements, preemptively retaining students and substituting away from low-stakes subjects like science and social studies.