Article

Controlling for Student Background in Value-Added Assessment of Teachers

Abstract

The Tennessee Value-Added Assessment System (TVAAS) measures teacher effectiveness on the basis of student gains, implicitly controlling for socioeconomic status and other background factors that influence initial levels of achievement. The absence of explicit controls for student background has been criticized on the grounds that these factors influence gains as well. In this research we modify the TVAAS by introducing commonly used controls for student SES and demographics. The introduction of controls at the student level has a negligible impact on estimated teacher effects in the TVAAS, though not in a simple fixed effects estimator with which the TVAAS is compared. The explanation lies in the TVAAS's exploitation of the covariance of tests in different subjects and grades, whereby a student's history of test performance substitutes for omitted background variables.

... The Colorado Growth Model (Betebenner, 2009) uses the previous year's test score as a covariate in a quantile regression model. Other VAMs have been defined using mixed models (Sanders et al., 1997; Raudenbush and Bryk, 2002; Ballou et al., 2004; McCaffrey et al., 2004; Lockwood et al., 2007; Harris and McCaffrey, 2010; Wright et al., 2010; Mariano et al., 2010), in which the response is a vector of student scores over time and teacher contributions are modeled through random effects. In mixed models, the empirical best linear unbiased predictors (EBLUPs) of the teacher random effects serve as the VAM scores. ...
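For readers unfamiliar with the mixed-model formulation mentioned in the excerpt above, the generic linear mixed model and its EBLUP can be written as follows. The notation ($y$, $X$, $Z$, $\beta$, $\gamma$, $G$, $R$) is an assumption chosen for illustration and is not taken from any of the cited papers; specific VAMs differ in how they structure $Z$, $G$, and $R$.

```latex
\begin{aligned}
y &= X\beta + Z\gamma + \varepsilon, \qquad \gamma \sim N(0, G), \quad \varepsilon \sim N(0, R), \\
V &= Z G Z^{\top} + R, \\
\hat{\beta} &= \left(X^{\top} \hat{V}^{-1} X\right)^{-1} X^{\top} \hat{V}^{-1} y, \\
\hat{\gamma} &= \hat{G} Z^{\top} \hat{V}^{-1} \left(y - X\hat{\beta}\right).
\end{aligned}
```

Here $y$ stacks the student scores, $\gamma$ holds the teacher random effects, and hats denote quantities evaluated at the estimated variance components; $\hat{\gamma}$ is the EBLUP that would serve as the VAM score in this generic setup.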
... The models cited above also assume that every student has complete data over the time period studied, or that missing data patterns have no information about teacher effectiveness. Ballou et al. (2004) note that longitudinal mixed model approaches allow students to have missing test scores for some years by including a partial vector of responses, but such analyses assume that the data are missing at random. Missing data are ubiquitous in longitudinal education data. ...
... Furthermore, inference usually focuses on the current year VAM effects. In this analysis, we focused on the future year effect from the GP VAM rather than the current year effects. Ballou et al. (2004), Lockwood et al. (2007), and ...
Preprint
Value-added models have been widely used to assess the contributions of individual teachers and schools to students' academic growth based on longitudinal student achievement outcomes. There is concern, however, that ignoring the presence of missing values, which are common in longitudinal studies, can bias teachers' value-added scores. In this article, a flexible correlated random effects model is developed that jointly models the student responses and the student missing data indicators. Both the student responses and the missing data mechanism depend on latent teacher effects as well as latent student effects, and the correlation between the sets of random effects adjusts teachers' value-added scores for informative missing data. The methods are illustrated with data from calculus classes at a large public university and with data from an elementary school district.
... For the purposes of this review, VAMs are defined as complex regression models via which modelers use students' histories of scores on academic achievement tests to determine those students' expected scores or expected gains on current achievement tests, and thereby estimate the specific contributions of individual teachers (often referred to as "teacher effects") to observed gains (or losses) on those tests (AERA, 2015; American Statistical Association [ASA], 2014; Braun, 2005). Some VAMs include student attributes, such as socioeconomic status, disability, attendance, or English learner status, as covariates to adjust estimates of teacher effects, and some do not (Ballou et al., 2004; McCaffrey et al., 2004). Regardless of the specific procedure used, though, VAM scores clearly play an important role in many US (and international; Araujo et al., 2016; Levy et al., 2019; Sahlberg, 2011; Sørensen, 2016; Smith & Kubacka, 2017) teacher evaluation and accountability policies and systems. ...
... The significant difference between the weighted and unweighted means, again, suggested that papers in which authors challenged the valid use of VAM in teacher evaluation were not promoted by VAM scholars as broadly as papers which supported the general use of VAMs. Specifically, authors of eight of the articles directly studying this issue found evidence suggesting freedom from bias, thus supporting the use of VAMs in teacher evaluation (Backes et al., 2018; Ballou et al., 2004; Chetty et al., 2014a, 2014b; Goldhaber & Chaplin, 2015; Goldhaber, Cowan et al., 2013; Lockwood, McCaffrey, Hamilton, et al., 2007; Loeb et al., 2014), while double that number, or authors of 16 articles, found evidence of bias, thus challenging the IUA (Ballou & Springer, 2015; Castellano et al., 2014; Chetty et al., 2016; Dieterle et al., 2015; Goldhaber, Goldschmidt et al., 2013; Kupermintz, 2003; Lockwood & McCaffrey, 2014; Martineau, 2006; McCaffrey et al., 2004; Polikoff & Porter, 2014; Rothstein, 2009, 2010; Stacy et al., 2018). ...
... When comparing confidential supervisor evaluations to VAM results, Harris and colleagues (Harris & Sass, 2014) found that principals considered criteria beyond teachers' contributions to student test scores when they evaluated teacher quality, which they distinguished from teacher effectiveness (as measured by VAMs; see also Backes et al., 2023). This suggests that there may be more to the construct than can be measured by student test scores alone. ...
Article
Local education agencies (LEAs) continue to use value-added models (VAMs) for teacher evaluation policies and purposes, often with consequences attached. Although the Every Student Succeeds Act (ESSA) provides more flexibility to LEAs, few have discontinued VAM use, suggesting they interpret VAMs as a valid measure of teacher effectiveness. In this systematic review, we used a framework built on the Standards for Educational and Psychological Testing (AERA et al., 2014) to examine validity evidence contained in 75 articles published in high-quality, peer-reviewed journals in which article authors supported or challenged user interpretations and uses of VAMs. Results with implications for educational policy are presented.
... However, not all value-added models control for student background. For example, some researchers who have used value-added models have argued that controlling for student background may sometimes over-adjust the teacher effect estimates (Ballou, Sanders, & Wright, 2004; Sanders & Rivers, 1996). ...
... Previous work has also examined the persistence of teacher effects on student achievement using value-added models (e.g., Ballou, Sanders, & Wright, 2004; McCaffrey, Lockwood, Koretz, Louis, & Hamilton, 2004; Sanders & Rivers, 1996). For example, Sanders and Rivers (1996) used a value-added model to predict the teacher effects in grades 3, 4, and 5 on fifth-grade achievement, controlling for achievement in second grade. ...
... Previous work has examined the cumulative nature of teacher effects on student achievement using value-added models (e.g., Ballou, Sanders, & Wright, 2004; McCaffrey, Lockwood, Koretz, Louis, & Hamilton, 2004; Sanders & Rivers, 1996). For example, Sanders and Rivers (1996) used a value-added model to predict the teacher effects in grades 3, 4, and 5 ...
Article
We examined the persistence of teacher effects from grade to grade on lower-performing students using high-quality experimental data from Project STAR, where students and teachers were assigned randomly to classrooms of different sizes. The data included information about mathematics and reading scores and student demographics such as gender, race, and SES. Teacher effects were computed as residual classroom achievement within schools and within grades. Then, teacher effects were used as predictors of achievement in following grades and quantile regression was used to estimate their persistence. Results consistently indicated that all students benefited similarly from teachers. Overall, systematic differential teacher effects were not observed and it appears that lower-performing students benefit as much as other students from teachers. In fourth grade there was some evidence that lower-performing students benefit more from effective teachers. Results from longitudinal analyses suggested that having effective teachers in successive grades is beneficial to all students and to lower-performing students in particular in mathematics. However, having low-effective teachers in successive grades is detrimental to all students and to lower-performing students in particular in reading.
... Teacher's value-added: Ability to improve student learning outcomes as measured by student gains on standardised tests (Ballou, Sanders, & Wright, 2004) or ratings of teachers' performance through classroom observations (Hafen, Hamre & Allen et al., 2015). ...
... In the simplest terms, teachers who are effective enable their students to learn. With the growth of standardised testing, teacher effectiveness has been operationalised as teacher's "value-added", meaning their ability to improve student learning as measured by student gains on standardised tests (Ballou, Sanders, & Wright, 2004) or ratings of teachers' performance through classroom observations (Hafen, Hamre & Allen et al., 2015). However, debate remains over whether teachers' impacts on students' test scores are an appropriate measure of their effectiveness and to what extent, and how, they can be used for accountability purposes. ...
... In line with the scope of the study, the literature reviewed also focused on school-related variables. This revealed that adding school-level predictors makes little difference to the teacher value-added effectiveness estimates [18,24,32,33], except for one study suggesting that including the percentage of students receiving special education services at the school level in the estimates has a benefit [14]. ...
... Supporting the finding about the ineffectiveness of school-level variables, the retrieved literature on the effectiveness of teachers also indicates that school-level measures, the percentage of students receiving free/reduced-price lunches, class size, racial/ethnic composition, students with special educational needs and those with English as a second language (ESL) accounted for very little of the variance in student attainment [33]. Ballou et al. [32] suggest that controlling for the percentage of students eligible for free/reduced-price lunches in schools has a substantial impact on TVAAS (the Tennessee Value-Added Assessment System) estimates in some grades and subjects; however, the authors also emphasize the precision of the models used and advise caution in terms of this finding. Alban [14], using the socio-economic level of the school, the percentage of students receiving special services, enrolment, and mobility and ethnicity as school-level predictors, found that prior achievement is the only significant variable in each estimation, followed by the percentage of students receiving special education. ...
Article
It is widely believed that the teacher is one of the most important factors influencing a student’s success at school. In many countries, teachers’ salaries and promotion prospects are determined by their students’ performance. Value-added models (VAMs) are increasingly used to measure teacher effectiveness to reward or penalize teachers. The aim of this paper is to examine the relationship between teacher effectiveness and student academic performance, controlling for other contextual factors, such as student and school characteristics. The data are based on 7543 Grade 8 students matched with 230 teachers from one province in Turkey. To test how much progress in student academic achievement can be attributed to a teacher, a series of regression analyses were run including contextual predictors at the student, school and teacher/classroom level. The results show that approximately half of the differences in students’ math test scores can be explained by their prior attainment alone (47%). Other factors, such as teacher and school characteristics, explain very little of the variance in students’ test scores once prior attainment is taken into account. This suggests that teachers add little to students’ later performance. The implication, therefore, is that any intervention to improve students’ achievement should be introduced much earlier in their school life. However, this does not mean that teachers are not important. Teachers are key to schools and student learning, even if they are not differentially effective from each other in the local (or any) school system. Therefore, systems that attempt to differentiate “effective” from “ineffective” teachers may not be fair to some teachers.
... This strategy intends to mitigate nonrandom sorting. The core assumption underlying this strategy is that students' statistics attitudes measured at the beginning of the courses are sufficient to summarize all the factors that cause the disparities in their statistics attitudes up to that point (Ballou et al., 2004). In practice, those factors are often unmeasurable and/or beyond a teacher's control. ...
... Lastly, our analysis provides empirical support for the use of multilevel models with the need to control for student background, as advised by Ballou et al. (2004). Our findings on the relationships between student gender and statistics attitudes are consistent with those reported in early studies conducted in other countries than the United States (Ramirez et al., 2012) and one recent study conducted in the United States (van Es & Weaver, 2018). ...
Article
Using data from 23 statistics instructors and 1,924 students across 11 post-secondary institutions in the United States, we employ multilevel covariate adjustment models to quantify the sizes of instructor and instructional effects on students' statistics attitudes. The analysis suggests that changes in students' statistics attitudes vary considerably across statistics instructors. Instructor-associated changes in students' statistics attitudes are positively associated with instructional practices most proximal to tasks involving data as well as with instructors' attitudes toward teaching their statistics classes. Moreover, instructor-associated changes in students' statistics attitudes are positively related to changes in students' expected grades. These findings lend support to previous qualitative findings about links between certain dimensions of teaching practices and students' statistics attitudes.
... Linear mixed models are used for value-added assessment of teachers based on the standardized test scores of their students (Morganstein & Wasserstein 2014). There are many variations of these models, but one particular instance is the complete persistence (CP) VAM (Ballou et al. 2004, Mariano et al. 2010, Karl et al. 2013b). In the CP model, intra-student correlation is modeled with off-diagonal entries in the error covariance matrix (R). ...
Preprint
The Hausman specification test detects inconsistency in mixed model estimators of fixed effects by comparing the original model with an alternative specification in which random effects are treated as fixed. This note illustrates a bias diagnostic from the statistical literature as a complement to the Hausman test. The diagnostic can provide internal estimates of parameter-specific bias in the mixed model estimators without requiring the second, fixed-effects-only model to be fit. We apply the diagnostic to a panel data analysis as well as to a Value-Added Model (VAM) for teacher evaluation.
... Therefore, a substantial proportion of students have data only in math or only in reading. Because many students in Catholic schools lacked demographic data, we used multiple pre-tests, which other authors consider a partial solution to missing demographics (Ballou et al., 2004; Ehlert et al., 2014; Johnson et al., 2015). We estimated models using several matching methods, using multiple pretests in the same subject from adjacent time periods, and using multiple pretests in different subjects from the same time period when available. ...
... These advances led several countries, especially OECD members, to implement operational systems not only for students but also for teachers and schools. The most widely used value-added models are the mixed-model approach implemented by Sanders (2006), the Tennessee Value-Added Assessment System (Ballou et al., 2004; Paige et al., 2018; Qin & Zhang, 2022), and hierarchical linear models (HLM) (Raudenbush & Bryk, 2002; Franco, 2019). ...
Article
The purpose of this research is to measure the value added, in the development of quantitative reasoning competence, by the institutions that offer the Economics program in Colombia. In education, value added refers to the contribution that an institution, in this case a university, makes to the development of students' academic competencies. To measure value added, a two-level multilevel econometric model was implemented. Among the main results, students' socioeconomic conditions are positively related to academic achievement; likewise, the institutions that produce the most value added are generally those with the highest performance.
... 60). That is, adjusting the TVAAS model to statistically control for student SES and demographic variables did not result in significantly different value-added scores for teachers (Ballou et al., 2004). Scholars who challenge the merits of value-added models raise concerns regarding validity and reliability, or consistency of scores over time, finding "teachers classified as 'effective' one year might have a 25% to 59% chance of being classified as 'ineffective' the next year, or vice versa" (Amrein-Beardsley & Close, 2019, p. 872). ...
Article
The COVID-19 pandemic disrupted many school accountability systems that rely on student-level achievement data. Many states encountered uncertainty about how to meet federal accountability requirements without typical school data. Prior research provides evidence that student achievement is correlated to students' social background, which raises concerns about the predictive bias of accountability systems. This mixed-methods study (a) examines the predictive ability of non-achievement-based variables (i.e., students' social background) on school districts' report card letter grade in Ohio, and (b) explores educators' perceptions of report card grades. Results suggest that social background and community demographic variables have a significant impact on measures of school accountability.
... The Tennessee Value-Added Assessment System (TVAAS) may also be worth considering, as it is intended to measure the effectiveness of education while excluding socioeconomic status. However, others, including Ballou, Sanders, and Wright [38], have attempted to modify this evaluation system. A recent paper claims that such measurements may not be helpful and calls for more meaningful measures [39]. ...
Article
With e-learning rapidly gaining popularity, evaluating its effectiveness and efficiency has become a challenge in public education, the public sector, and the corporate sector. Measuring knowledge transfer is crucial in any learning process, but e-learning lacks validated methods for this. Here we examine ways to evaluate it, particularly in the case of e-learning, by conducting a literature review to assess available measurement solutions, developing an evaluation method for knowledge transfer, and validating the method. Using logged data from e-courses, it is possible to assess the knowledge transfer in e-learning. We describe a novel method for classifying effectiveness and efficiency with measured values and measurement instruments. The new measurement method was aligned with a data set of an existing learning management system, and the effectiveness and efficiency of knowledge transfer were analysed using quantitative means, including descriptive statistics, regression modelling, and cluster analysis based on a specific e-learning course. This newly elaborated and validated knowledge transfer measurement technique could be a useful tool for anyone wanting to evaluate e-learning courses and can also serve as a baseline for academics to further develop or implement it on larger empirical datasets.
... Through sorting out the goals and indicators of aesthetic education in college and reflecting on the restrictive factors, the paper summarizes the needs of aesthetic education objects (namely students receiving aesthetic education) as the center. The dynamic system model of aesthetic education is composed of seven basic elements, such as aesthetic education object, aesthetic education subject, aesthetic education content, aesthetic education subject-object relationship, curriculum objective, teaching design and teaching reform [2]. The improvement of college students' aesthetic quality is the result of various dynamics in society, school and family. ...
Article
Based on the analysis of the current status of aesthetic education course effectiveness evaluation, this paper constructs an evaluation model of aesthetic education course effectiveness appreciation for the improvement of college students' aesthetic literacy. This paper studies how to realize the effective unification of value-added evaluation and aesthetic education efficiency measurement and the scientific correlation between aesthetic education efficiency and the improvement of college students' aesthetic quality.
... Often underappreciated is how monumental the task of catching up can be: It requires exceptional resources and support, continued over years (Regenstein, 2019). For example, students exhibit a cumulative positive effect of experiencing consecutive years of high-quality teaching and a cumulative negative effect of low-quality teaching (Ballou et al., 2004; Jordan et al., 1997; Sanders & Horn, 1998; Sanders & Rivers, 1996; Sanders et al., 1997). The latter is more probable for high-risk students (Akiba et al., 2007; Darling-Hammond, 2006). ...
Article
A follow-up of a cluster-randomized trial evaluated the long-term impacts of a scale-up model composed of 10 research-based guidelines grounded in learning trajectories. Two treatment groups received the intervention during the prekindergarten year, and one of these groups received follow-through support in kindergarten and first grade. Business-as-usual curricula were used in all other cases, including all years for the control group. Early effects on mathematics achievement decreased through fourth grade but reemerged at fifth grade. These results support both a latent trait hypothesis, whereby stable characteristics of students explain differences in achievement, and a latent foundation hypothesis, whereby early mathematical knowledge and skills provide a foundation for competence in mathematics in later years, especially those that involve challenging mathematics.
... This problem may exist at the school level when students are assigned to schools based on where their parents and guardians live. Across schools, students differ with respect to socioeconomic status, community resources, parental education, and peer and family support of academics, which can possibly impact student learning and performance on accountability measures (Ballou, Sanders, & Wright, 2004; Henderson & Mapp, 2002). These factors are generally beyond the school's control, but are likely to affect any "effectiveness" score based upon students' test performance. ...
... This represents, on the one hand, a challenge for the students, teachers, schools, and the educational system, but on the other hand, it constitutes the possibility to investigate this unique educational learning environment as an anticipatory model for those other educational systems with an increase in student diversity. Previous studies which have compared different VA models or the inclusion or exclusion of covariates have found high positive correlations between VA models (most of them above .90; e.g., Ballou, Sanders, & Wright, 2004; Johnson, Lipscomb, & Gill, 2015). However, high correlations will not prevent teachers from being misclassified. ...
... In order to understand the concept of value-added assessment, think of it as a growth curve [9]. Parents stand their child up against a wall and mark their child's height at age 2, 3, 4, and so on, with a pencil on the wall. ...
Article
As a programmatic document to guide the reform of educational assessment, the “General Plan for Deepening the Reform of Educational Assessment in the New Era” clearly points out the requirement for exploring value-added assessment [1]. In the process of exploring, Tennessee Value-Added Assessment System (TVAAS), which was implemented in Tennessee, United States in 1992, has a certain referential significance to the practice of assessment reform in primary English education [2]. This study aims to build a value-added assessment model in line with China’s learning conditions by using big data and carry out pilot experiments in order to promote the development of educational assessment in primary schools.
... Behavioural literature has documented the influence that background information of participants (demographic variables) has on achievement/performance (e.g. Ballou et al., 2004; Casanova et al., 2005). Business literature also highlights the role of enterprise owners/entrepreneurs' demographic characteristics in their enterprises' performance (e.g. ...
Article
Many sub-Saharan African countries promote micro, small and medium enterprises (MSMEs) to play crucial roles in Africa's socio-economic development. However, evidence from previous research highlights that enterprises remain informal, small, and with low performance due to the formidable challenges. Previous efforts that attempt to understand what challenges/determines the enterprises' performance considerably focus on factors outside the MSMEs, with little attention to the enterprises' own market and customer-focused activities. The current study examines the effect of the enterprises' market-focused activities on their performance by conducting an empirical survey among 150 enterprises in Ethiopia. Apart from extending the conceptualization on the role of market-focused practices, the study suggests development workers complement their efforts by backing the enterprises' endeavours to entirely focus on customer-centric operations.
... The VA model is often extended to include student sociodemographic characteristics since these also vary across schools at intake, are similarly argued to be beyond the control of the school, and predict current achievement over and above prior achievement (Ballou et al., 2004; Leckie & Goldstein, 2019; Raudenbush & Willms, 1995). The resulting models have been referred to as both "contextualised value-added" (CVA) and "Type A" models. ...
Article
School accountability systems increasingly hold schools to account for their performances using value-added models purporting to measure the effects of schools on student learning. The most common approach is to fit a linear regression of student current achievement on student prior achievement, where the school effects are the school means of the predicted residuals. In the literature, further adjustments are usually made for student sociodemographics and sometimes school composition and “non-malleable” characteristics. However, accountability systems typically make fewer adjustments: for transparency to end users, because data are unavailable or of insufficient quality, or for ideological reasons. There is therefore considerable interest in understanding the extent to which simpler models give similar school effects to more theoretically justified but complex models. We explore these issues via a case study and empirical analysis of England’s “Progress 8” secondary school accountability system.
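The residual-based approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the Progress 8 calculation or any operational system: the function name `school_value_added`, the record layout, and the toy scores are all assumptions made for the example, and it uses ordinary least squares with a single prior-achievement predictor.

```python
from collections import defaultdict
from statistics import mean


def school_value_added(records):
    """Return {school_id: mean residual} from a one-predictor OLS fit.

    `records` is an iterable of (school_id, prior_score, current_score)
    tuples; this layout is a hypothetical choice for the sketch.
    """
    records = list(records)
    prior = [r[1] for r in records]
    current = [r[2] for r in records]

    # Fit current = intercept + slope * prior by ordinary least squares.
    x_bar, y_bar = mean(prior), mean(current)
    sxx = sum((x - x_bar) ** 2 for x in prior)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(prior, current))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Collect each student's residual under the school they attend.
    residuals = defaultdict(list)
    for school, x, y in records:
        residuals[school].append(y - (intercept + slope * x))

    # The school effect is the mean residual of its students.
    return {school: mean(vals) for school, vals in residuals.items()}


# Toy data (made up): school A's students score above the fitted line,
# school B's students score below it.
toy = [
    ("A", 10.0, 22.0), ("A", 20.0, 32.0),
    ("B", 10.0, 18.0), ("B", 20.0, 28.0),
]
effects = school_value_added(toy)
```

In this toy data the fitted line has slope 1 and intercept 10, so school A's mean residual is 2 and school B's is -2. A contextualised variant would add sociodemographic predictors to the regression before averaging the residuals.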
... Under certain conditions, however, adding them may not be necessary, e.g., when the model includes a larger number of prior results from different years or subjects (Ballou, Sanders & Wright, 2004; Koedel et al., 2015). For practical reasons, it is sensible to use the simplest possible model, provided it is "good enough", i.e. ...
Article
The aim of this paper is to point out, on the basis of a comparison of four European education systems (England, France, Norway, Poland), the possibilities as well as the risks associated with evaluating schools on the basis of value added. Methods. The approaches of the individual countries are compared according to a predetermined set of criteria. The data sources are technical documentation on value-added models, web portals for publishing results, and texts on the development of the models and the use of value-added data. Results. The preferred data source for calculating value added is standardized tests or examinations. Data availability influences the choice of educational level, the educational areas (subjects) assessed, and the inclusion of contextual variables. When publishing value added, countries discourage open comparison of schools, do not produce league tables, and acknowledge the statistical uncertainty associated with measurement. Value-added information serves as feedback to schools. Conclusions. Value-added indicators are a fair indicator of schools' educational contribution; however, their introduction carries risks such as the omission of important variables due to unavailable data, a lack of expert capacity for implementing and developing the method, and limited usefulness for improving the quality of education.
... This finding is robust in all three models examined, and it is in line with the literature. Studies reveal that the variance among student performance is mostly explained by their past performance (Ballou et al., 2004; BenDavid-Hadar, 2018; Boyd et al., 2008; Goldschmidt et al., 2005; Jerrim et al., 2020; Ladd and Walsh, 2002; Lissitz et al., 2006; Meyer and Christian, 2008; OECD, 2008; Ray, 2006; Ray et al., 2008; Sun et al., 2017). ...
Article
We examined the relationship between the school principal's leadership style, as perceived by the school teachers, and improvement in the performance of students with special education needs enrolled in specialized schools for students with conduct disorders. Our motivation originates in the increasing trend in their share within the general population and the premise that this unique population may respond differently to school principal leadership style. Datasets on students’ previous performance, students’ background characteristics, teacher profiles, and school features were collected. In addition, a questionnaire on teachers’ perceptions of their school principal's leadership style was distributed. Datasets were collected from 92 teachers who worked in special education needs public schools that specialized in conduct disorders. Using STATA software, we estimated multilevel fixed-effects models. We found that the more the school principal is perceived as a transformational leader, the higher the students’ performance. Additionally, secondary school advantaged students (i.e. having a high level of previous performance, high socioeconomic strata), who are taught by more educated teachers, exhibit higher performance compared with their counterparts. Based on our findings, we recommend that policy makers consider assigning transformational leaders to low-performing schools. In addition, policy makers may want to allocate extra learning resources and to provide access to learning services to support the disadvantaged students’ learning process.
... Studies such as Hill, Kapitula, and Umland's (2011) offer evidence that value-added scores relate positively to measures of teachers' content knowledge and their quality of instruction. Further, Meyer (1997) and Ballou, Sanders, and Wright (2004) contend that school-level value-added measurement is a significant advance in comparing schools, particularly over school performance average and median test scores. Other measures such as average or median test scores can be strongly influenced by students' previous achievement, student mobility, and other nonschool factors. ...
Article
Background/Context Modest gains in NAEP scores by American high schools over the past twenty years highlight the need to identify different factors associated with gains in student achievement. Amongst those potential factors is school leadership; limited research on leaders’ work in secondary schools highlights the need to understand how high school leaders structure their schools to promote student learning. Purpose/Objective/Research Question We ask the question, What distinguishes leaders’ practices in more effective high schools from those in less effective high schools that serve large proportions of at-risk youth? Research Design We first identify more and less effective high schools using value-added scores, and we analyze interview, observational, and survey data collected in these schools to compare and contrast how leaders support key practices and organizational routines by their staff. Our analyses include work by traditional leaders (principals and assistant principals) as well as other leaders’ (e.g. department chairs, teacher leaders) practices within the schools. Conclusions/Recommendations We found differences between higher and lower value-added schools in terms of leaders’ conceptions of the intended routines (those ideal policies that faculty are to carry out) and their attention to the implementation of them, through closer examination of faculty members’ actual actions or their directed support for faculty members’ practices. Two primary themes characterize the differences in their practices. First, leaders in higher value-added high schools are more involved in, intentional about, and attentive to how their ideal/intended routines are implemented, thus ensuring that teachers’ actual practices are changed. They focus on how these routines provide ongoing monitoring and feedback for their faculty to build and improve teachers’ quality instruction, alignment of curriculum, and systems of support for students. 
Second, higher value-added school leaders provided more targeted, systemic efforts to support personalized learning for students.
... The items in this section asked respondents whether VAM scores are generally free from bias or are confounded by other sources of variance. Additional items asked about some specific sources of bias discussed in the VAM literature, such as nonrandom assignment of teachers and students to classrooms (Condie et al., 2014; Everson et al., 2013; Johnson et al., 2015; Rothstein, 2009), the inclusion or exclusion of student-level covariates in the model (Ballou et al., 2004; McCaffrey et al., 2004), and the effects of missing data (Amrein-Beardsley, 2008). ...
Article
Background/Context The Race to the Top federal initiatives and requirements surrounding waivers of No Child Left Behind promoted expanded use of value-added models (VAMs) to evaluate teachers. Even after passage of the Every Student Succeeds Act (ESSA) relaxed these requirements, allowing more flexibility and local control, many states and districts continue to use VAMs in teacher evaluation systems, suggesting that they consider VAMs a valid measure of teacher effectiveness. Scholars in the fields of economics, education, and quantitative methods continue to debate several aspects of VAMs’ validity for this purpose, however. Purpose The purpose of this study was to directly ask the most experienced VAM scholars about the validity of VAM use in teacher evaluation based on the aspects of validity described in the Standards for Educational and Psychological Testing and found in a review of high-quality peer-reviewed literature on VAMs. Participants We invited the 115 scholars listed as an author or coauthor of one or more of the 145 articles on evaluating teachers with VAMs that were published in prominent peer-reviewed journals between 2002 and the implementation of ESSA in 2016. In this article, we analyze data from 36 respondents (12 economists, 13 educators, and 11 methodologists) who rated themselves as “experienced scholars,” “experts,” or “leading experts” on VAMs. Research Design This article reports both quantitative and qualitative analyses of a survey questionnaire completed by experienced VAM scholars. Findings Analyses of 44 Likert-scale items indicate that respondents were generally neutral or mixed toward the use of VAMs in teacher evaluation, though responses from educational researchers were more critical of VAM use than were responses from economists and quantitative methodologists. 
Qualitative analysis of free response comments suggests that participants oppose exclusive or high-stakes use of VAMs but are more supportive of their use as a component of evaluation systems that use multiple measures. Conclusions These findings suggest that scholars and stakeholders from different disciplines and backgrounds think about VAMs and VAM use differently. We argue that it is important to understand and address stakeholders’ multiple perspectives to find the common ground on which to build consensus.
... We also measure student demographic characteristics, such as gender, race/ethnicity, free-and-reduced lunch status, time between testing administrations, and grade level. We also include students' prior test scores, which help further reduce selection bias in the estimates (Ballou, Sanders, & Wright, 2004). ...
Article
Background Although we have learned a good deal from lottery-based and quasi-experimental studies of charter schools, much of what goes on inside of charter schools remains a “black box” to be unpacked. Grounding our work in neoclassical market theory and institutional theory, we examine differences in the social organization of schools and classrooms to enrich our understanding of school choice, school organizational and instructional conditions, and student learning. Purpose/Objective/Research Question/Focus of Study Our study examines differences in students’ mathematics achievement gains between charter and traditional public schools, focusing on the distribution and organization of students into ability groups. In short, we ask: (1) How does the distribution of ability grouping differ between charter and traditional public schools? And (2) What are the relationships between ability group placement and students’ mathematics achievement gains in charter and traditional public schools? Research Design With a matched sample of charter and traditional public schools in six states (Colorado, Delaware, Indiana, Michigan, Minnesota, and Ohio), we use regression analyses to estimate the relationship between student achievement gains and school sector. We analyze how ability grouping mediates this main effect, controlling for various student, classroom, and school characteristics. Findings We find significant differences in the distribution of students across ability groups, with a more even distribution in charter compared to traditional public schools, which appear to have more selective placements for high groups. Consistent with prior research on tracking, we also find low-grouped students to be at a significant disadvantage when compared with high- and mixed-group peers in both sectors. 
Conclusions Although we find some significant differences between ability group placement and student achievement gains in mathematics, these relationships do not differ as much by sector as market theory (with its emphasis on innovation and autonomy) would predict. Consistent with institutional theory, both sectors still group students by ability and have similar relationships between gains and grouping.
... Such differences are considered beyond the control of the school. Student sociodemographic characteristics are often added to more convincingly adjust for the nonrandom selection of students into schools (Ballou et al., 2004; Leckie and Goldstein, 2019). ...
Preprint
School value-added models are widely applied to study the effects of schools on student achievement and to monitor and hold schools to account for their performances. The traditional model is a multilevel linear regression of student current achievement on student prior achievement, background characteristics, and a school random intercept effect. The predicted random effect aims to measure the mean academic progress students make in each school. In this article, we argue that much is to be gained by additionally studying the variance in student progress in each school. We therefore extend the traditional model to allow the residual variance to vary as a log-linear function of the student covariates and a new school random effect to predict the influence of schools on the variance in student progress. We illustrate this new model with an application to schools in London. Our results show the variance in student progress varies substantially across schools - even after adjusting for differences in the variance in student progress associated with different student groups - and that this variation is predicted by school characteristics. We discuss the implications of our work for research and school accountability.
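The model described in this abstract can be written out as a rough sketch. The notation below is an assumption made for illustration and is not taken from the preprint: $y_{ij}$ is the current achievement of student $i$ in school $j$, $\mathbf{x}_{ij}$ collects prior achievement and background characteristics, $u_j$ is the usual school random intercept, and $v_j$ is the additional school random effect in the residual variance. The joint distribution of $(u_j, v_j)$ is left unspecified here.

```latex
\begin{align}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + e_{ij},
  \qquad e_{ij} \sim N\!\left(0,\ \sigma_{e,ij}^{2}\right), \\
\log \sigma_{e,ij}^{2} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\alpha} + v_j .
\end{align}
```

Under this reading, the predicted $u_j$ plays the role of the traditional value-added score (mean progress), while the predicted $v_j$ indicates whether progress in school $j$ is more or less variable than the student covariates alone would imply.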
... The classrooms' EB estimates can also be directly obtained from a value-added multilevel model, where classroom effects are estimated as random intercepts (Ballou, Sanders and Wright, 2004; Rabe-Hesketh and Skrondal, 2012; Guarino, Reckase and Wooldridge, 2015). ...
... The VA model is often extended to include student sociodemographic characteristics since these also vary across schools at intake, are similarly argued to be beyond the control of the school, and predict current achievement over and above prior achievement (Ballou et al., 2004; Leckie and Goldstein, 2019; Raudenbush & Willms, 1995). The resulting models have been referred to as both 'contextualised value-added' (CVA) and 'Type A' models. ...
Preprint
School accountability systems increasingly hold schools to account for their performances using value-added models purporting to measure the effect of schools on student learning. The most common approach is to fit a linear regression of student current achievement on student prior achievement, where the school effects are the school means of the predicted residuals. In the literature further adjustments are made for student sociodemographics and sometimes school composition and 'non-malleable' characteristics. However, accountability systems typically make fewer adjustments: for transparency to end users, because data is unavailable or of insufficient quality, or for ideological reasons. There is therefore considerable interest in understanding the extent to which simpler models give similar school effects to more theoretically justified but complex models. We explore these issues via a case study and empirical analysis of England's 'Progress 8' secondary school accountability system.
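The "most common approach" named in this abstract (regress current on prior achievement, then average the residuals within each school) can be sketched in a few lines. This is a minimal illustration on synthetic data, not the Progress 8 calculation or the authors' code; the function name `school_value_added`, the simulated effect sizes, and the use of plain least squares are all assumptions made for the example.

```python
import numpy as np

def school_value_added(prior, current, school):
    """Regress current on prior achievement by OLS, then average residuals by school.

    Hypothetical helper for illustration; returns {school_id: mean residual}.
    """
    prior = np.asarray(prior, dtype=float)
    current = np.asarray(current, dtype=float)
    school = np.asarray(school)
    # Design matrix with an intercept and prior achievement.
    X = np.column_stack([np.ones_like(prior), prior])
    beta, *_ = np.linalg.lstsq(X, current, rcond=None)
    residuals = current - X @ beta
    # School effect = mean residual of the students attending that school.
    return {s.item(): float(residuals[school == s].mean()) for s in np.unique(school)}

# Synthetic data (assumed values): three schools with different true effects.
rng = np.random.default_rng(0)
true_effect = np.array([-0.5, 0.0, 0.5])
school = np.repeat(np.arange(3), 200)
prior = rng.normal(size=school.size)
current = 0.7 * prior + true_effect[school] + rng.normal(scale=0.3, size=school.size)

va = school_value_added(prior, current, school)
for s, effect in va.items():
    print(f"school {s}: estimated value-added {effect:+.2f}")
```

Because the regression includes an intercept, the residuals sum to zero, so with equally sized schools the estimated effects are centred on zero: they are relative, not absolute, measures. Adding sociodemographic columns to `X` would give the further-adjusted variants the abstract mentions.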
... D. Ballou, W. Sanders, P. Wright (2004) and OECD (2012) emphasize that one of the main aspects of self-evaluation is the expression of student motivation and leadership in the educational organization. This is a very important shift in the recent national strategies of Western countries, which have begun to pursue leadership among school heads, teachers, and students. ...
Article
This research deals with the influence of speed reading on primary school pupils' leadership. The topic is relevant because schools face constant changes associated with modern society, so teaching the Lithuanian language is a difficult job that requires patience and good didactic knowledge. Educating a child in today’s busy, complex, and constantly changing world is also challenging, because qualities such as initiative, the ability to work with others, openness, communication skills, flexibility, and adaptability become especially important. In this context, leadership is a basic skill that should be developed rather than suppressed at an early age. The present period requires the modernization of educational institutions, a shift to more integrated work, and an effort to provide more personalized teaching and learning strategies. Speed reading techniques and instruments are among these strategies. The term “speed reading” is not very common, but it denotes a major learning strategy. Reading should be understood as an ongoing and constantly developing ability. Speed reading improves cognitive powers, sharpens critical thinking skills, and enhances problem-solving skills. The paper reveals a close link between primary school students' leadership and speed reading. Subject: the influence of speed reading in encouraging students’ leadership. Main research questions: 1. How can speed reading methodologies promote students’ leadership? 2. What influence does the general level of reading achievement have on developing the reading skills of children in the classroom? Summarizing the analysis of the research data, it can be argued that children with a leadership character have advanced reading ability because it is cultivated in the family even before they go to school. All subjects had early reading comprehension experience in the family. 
A behavioural change has been reported in a vibrant expression of leadership: high motivation, autonomy, operational planning, and helping each other. Speed reading techniques such as hypertext, text linking colors, and the Aurasma program are innovative, motivate children to work independently, and help them maintain concentration for longer with less distraction. It is also noted that speed reading instruments that are systematically matched to the reader's age and reading purpose can help to differentiate the content of education and to develop students' higher-order thinking abilities. The methodology of speed reading helps to develop children's leadership, which is manifested in cooperation, democracy, responsibility, planning of activities, and responsible learning. Foreign researchers' perceptions coincide with the study's conclusion that speed reading motivates students, helps them plan activities, and develops creativity in the search for information. Keywords: leadership, speed reading methodology, primary school education.
... (Sanders & Horn, 1994, 1998; Sanders, Saxton, & Horn, 1997; Webster & Mendro, 1997; Kupermintz, 2003). With the No Child Left Behind Act of 2001 (NCLB) taking effect, based on the belief that the most important school-related factor affecting student achievement is the quality of the teacher (Aaronson, Barrow, & Sander, 2007; Rivkin, Hanushek, & Kain, 2005), studies in this field gained momentum (Ballou, Sanders, & Wright, 2004; McCaffrey, Lockwood, Koretz, Louis, & Hamilton, 2004; Newton, Darling-Hammond, Haertel, & Thomas, 2010). In order to measure the effectiveness of teachers, several types of VAMs were developed and applied by states and school districts in the US, such as the Tennessee value-added assessment system (TVAAS) and the Dallas value-added accountability system (DVAAS). ...
Conference Paper
This article presents a systematic review of the stability of teacher effectiveness estimates from value-added models, focusing on the impact of the number of previous test scores employed. It aims to answer a single review question: How stable are teacher effectiveness estimates measured by VAMs? Using the terms teacher performance, student performance, value-added model, stability, and related synonyms, a comprehensive search was conducted in 17 databases, supplemented by a hand search in Google Scholar and email contact with authorised persons. In total, 1439 records were found. After the screening process, 50 studies remained for data extraction. Of these 50 studies, 13 focused on the stability of VAM estimates with respect to the number of prior test scores used. In summary, there is a common view that the use of prior-year data has a positive impact on value-added estimates of teacher effectiveness; however, researchers disagree about the impact of multiple previous years of data.
... Compositional effects are central to many new developments in the field of school and teacher effectiveness studies (Ballou, Sanders, & Wright, 2004; Guldemond & Bosker, 2009; Van de Grift, 2009; Verachtert, Van Damme, Onghena, & Ghesquière, 2009). The methodology that we propose has potential to be widely applicable in the field of education, in investigating the effects of continuous variables aggregated at a higher level (e.g., the student or the classroom) on individual-level outcomes. ...
Article
School value-added models are widely applied to study, monitor, and hold schools to account for school differences in student learning. The traditional model is a mixed-effects linear regression of student current achievement on student prior achievement, background characteristics, and a school random intercept effect. The latter is referred to as the school value-added score and measures the mean student covariate-adjusted achievement in each school. In this article, we argue that further insights may be gained by additionally studying the variance in this quantity in each school. These include the ability to identify both individual schools and school types that exhibit unusually high or low variability in student achievement, even after accounting for differences in student intakes. We explore and illustrate how this can be done via fitting mixed-effects location scale versions of the traditional school value-added model. We discuss the implications of our work for research and school accountability systems.
Article
We aim to estimate school value-added dynamically in time. Our principal motivation for doing so is to establish school effectiveness persistence while taking into account the temporal dependence that typically exists in school performance from one year to the next. We propose two methods of incorporating temporal dependence in value-added models. In the first we model the random school effects that are commonly present in value-added models with an auto-regressive process. In the second approach, we incorporate dependence in value-added estimators by modeling the performance of one cohort based on the previous cohort’s performance. An identification analysis allows us to make explicit the meaning of the corresponding value-added indicators: based on these meanings, we show that each model is useful for monitoring specific aspects of school persistence. Furthermore, we carefully detail how value-added can be estimated over time. We show through simulations that ignoring temporal dependence when it exists results in diminished efficiency in value-added estimation while incorporating it results in improved estimation (even when temporal dependence is weak). Finally, we illustrate the methodology by considering two cohorts from Chile’s national standardized test in mathematics.
Article
This mixed methods study compares how secondary school teachers implemented the 12-year Basic Education Reform in Taiwan in 2011 and 2021. The major sources of data were surveys of sample teachers and students in sixteen sample schools. The survey asked how often a teaching or evaluation strategy was used. The conclusions of both studies indicate that teacher-directed lessons (teacher talk and questioning) dominated. The use of student-centred learning (SCL) methods (activities and group work) was limited. However, almost 30% of the sample students in both studies stated that the behaviour of students in the classroom was affecting their work. A major obstacle for the reform remains high-stakes examinations, which rely heavily on rote memorization rather than the creative application of knowledge. Educators in all jurisdictions can learn from the reform efforts.
Article
Evaluating cognitive ability and skills development in higher education presents challenges because the focal outcome often serves as, or at least is reflected in, selection criteria for college admission decisions. Additionally, the baseline abilities and skills gained from earlier educational experiences impact students' further learning outcomes. To address these challenges, we propose an integrated framework for estimating students' growth that accounts for lasting transition effects and varying selection effects. Using a nationally representative longitudinal sample of Chinese university students, we found that students experienced medium-sized growth (0.67) in critical thinking skills, with a large effect size observed for evaluating the reasoning of an argument, a medium effect size for evaluating argument implications, and no significant growth in evaluating argument credibility. Our estimates of transition effects showed that the impact of earlier education experience on students' critical thinking growth diminished in the college years. After controlling students' socioeconomic status and the college entrance examination scores, we found that the within-university advantage in the college entrance examination served as a factor in explaining the university impacts on students’ critical thinking. Our findings provide rich insights into improving the evaluation of critical thinking skills in Chinese universities, and our illustrative framework can also be applied to similar education contexts.
Preprint
This study investigated the acceptability and feasibility of estimating reliable and valid Value-Added Models (VAMs) in England for the evaluation of school interventions and different teacher training routes. Datasets already available in English primary and secondary schools were explored to examine whether they could inform and support the evaluation of school interventions and different training routes. The originality of this study also lies in the use of end-of-year assessments and data which are already available and collected by schools in England. Furthermore, this study suggests a three-step assessment quality check framework for researchers and school data managers, which could be implemented before proceeding with any analysis of the school assessment data. The key finding of this study is that it is acceptable and feasible to estimate aggregated teacher impact with VAMs for evaluation of interventions and policies. This suggests that it is possible to maximise the use of available school data for evaluating the effectiveness of different interventions to support evidence-based decision-making.
Article
One of the issues in value-added assessment of teachers is the persistence of effects. Although the Generalized Persistence (GP) model addresses this issue, it is not commonly used because of the difficulty of interpreting its results. The purpose of this study is to examine teachers’ value-added scores using the GP model and interpret the findings in terms of teacher assessment. Test scores of the same students (1432 students studying in 12 different middle schools) for three consecutive test administrations were used to estimate value-added scores of 61 science and 42 Turkish teachers. While the correlation between current and future year value-added scores was high for science teachers (r = 0.88), it was lower for Turkish teachers (r = 0.46). On the other hand, the reliability of the estimates was found to be low. The study concludes with suggestions for the use of the GP model and implications for further research and policy use.
Article
This article is a conceptual paper that describes different approaches for teaching mathematics to secondary school students that blend learning with the intrinsic motivation that comes from their reactive intelligence. Students ultimately study mathematics after "learning math." The level of the children's mathematical knowledge is revealed by their ability to read a text and recognize symbols, including numbers, and visuals like diagrams and graphs. Adolescents use their diverse skill sets to master mathematics in these different contexts at the secondary level. The paper finds that students often use knowledge from their daily lived experiences (from direct and indirect engagements) to assist them with the mathematics in the textbook, and that they use conceptual understanding rather than formal procedures in response to the mathematics textbook. Stimulating questions, problem analysis, successful responses, and solving techniques are significant motivating tools, which are especially beneficial in the framework of mathematics learning. These tools and techniques are often used at the secondary level to teach mathematics.
Article
It is a much-lamented fact that research with the potential to inform or influence education policy instead remains policy inert. There are many reasons for this frustrating state of affairs, including a lack of strategic thinking on the part of researchers on how to successfully accomplish outreach—as opposed to communication with peers (in-reach). Another, and a principal focus of this article, is the failure of researchers to appreciate the power of employing compelling narratives to bring their findings to the attention of policymakers and other stakeholders. Accordingly, this article presents some examples of narratives specifically designed for outreach and discusses some of their features. It also considers the challenges in gaining traction with counternarratives once a particular narrative has achieved currency. Researchers should also be mindful of the tenor of the times, with experts now often viewed with skepticism, if not downright hostility. In some quarters, excessive reliance on technocrats is even seen as a threat to democratic governance. The article concludes with some recommendations on how to appropriately enhance the role of research in education policymaking.
Chapter
Teacher evaluation has been a considerable focus across many countries around the globe. The concern for how teachers are evaluated is largely predicated on decades of research that suggests high-quality teaching is crucial for student learning. In an attempt to improve teaching quality, education policymakers have advanced various reforms to systems designed to evaluate teacher performance. This chapter considers the case of the teacher evaluation reform in Tennessee. In 2011, the Tennessee state legislature voted to implement a number of changes to the teacher evaluation process, which, like those in many other states at the time, were enacted in an attempt to compete for federal education grants under the Race to the Top grant competition. This chapter offers an in-depth recounting of the formulation, passage, and implementation of Tennessee’s statewide teacher evaluation reforms. In addition, the chapter outlines the major design elements of the reformed teacher evaluation system, its underlying theory of action, and its impact as documented in the extant research base.
Article
Advocates of teacher value-added modelling (VAM) argue that this technique can provide evidence on teacher effectiveness to inform teacher policies and broader education system reforms. Critics contend that value-added is a poor proxy for teacher quality and as such is of questionable utility, especially where teacher accountability is concerned. In low- and middle-income countries, and especially sub-Saharan Africa, where the challenge of the ‘learning crisis’ is most severe, a lack of longitudinal data has precluded extensive debate on the matter. In this paper we explore the potential of value-added analysis for diagnostic purposes in the context of Ethiopia. We make use of data from the Young Lives longitudinal study – specifically two rounds of school surveys conducted in Ethiopia between 2012 and 2017 when pupils were in grades 4–8. Learning levels in the Young Lives sites in Ethiopia are very considerably below curricular expectations. Like many countries in sub-Saharan Africa, Ethiopia faces a significant challenge in terms of a ‘learning crisis’ and in terms of the attendant need to develop policies to improve educational effectiveness within the confines of very limited resources. We discuss the background to VAM models and their use, including in relation to the context of Ethiopia. The paper shows that learning progress in primary schools varies widely between classrooms, and between pupils within the same classroom. Some schools and teachers are more successful in raising overall attainment by ‘raising the floor’ of learning and narrowing the dispersion. Others are more successful by ‘raising the roof’. Less effective teachers appear to be particularly ineffective for pupils with higher scores at the start of the year. In contrast, the most effective teachers showed high levels of ‘value-added’ for pupils at all levels of prior performance. 
Diagnostic analysis of teacher value-added has potential, we argue, to aid understanding of contributors to low levels of learning such as: (i) over-ambitious curricula; (ii) absence of ‘teaching at the right level’; (iii) within class heterogeneity and pupil grouping strategies; and (iv) teaching and learning strategies – such as ‘differentiation’ or ‘mastery’.
Article
We apply “value-added” models to estimate the effects of teachers on an outcome they cannot plausibly affect: student height. When fitting the relatively simple models that are widely used in educational practice to New York City data, we find the standard deviation of teacher effects on height is nearly as large as that for math and reading, raising potential concerns about value-added estimates of teacher effectiveness. We consider two explanations: nonrandom sorting of students to teachers and idiosyncratic classroom-level variation. We cannot rule out sorting on unobservables, but do not find that students are sorted to teachers based on lagged height. The correlation in teacher effects estimates on height across years and the correlation between teacher effects on height and teacher effects on achievement are insignificant. The large estimated “effects” for height appear to be driven by year-to-year classroom-by-teacher variation that isn't often separable from true effects in models commonly estimated in practice. Reassuringly for use of these models in research settings, models which disentangle persistent effects from transient classroom-level variation yield the theoretically expected effects of zero for teacher value added on height.
Article
The present study examined the 2015 Trends in International Mathematics and Science Study “Advanced” data to examine how the educational credentials of maths teachers and other teacher characteristics were related to attitude towards advanced mathematics and perceptions of engaged teaching among 12th-grade students enrolled in advanced mathematics courses in the U.S. As attitudinal outcomes in this study, two measures of attitude towards mathematics were employed – the Students Like Learning Advanced Mathematics scale and the Students Value Advanced Mathematics scale, and one measure of student perception of engaged teaching – the Students’ Views on Engaging Teaching in Advanced Mathematics Lessons. A set of multilevel regression analyses were conducted predicting each of these aforementioned outcomes. No statistically significant effects on the attitudinal outcomes were observed for teacher variables. Positive effects were noted for parental education on students’ valuing of advanced mathematics. A prominent finding was that higher levels of parental education were associated with higher student levels of valuing mathematics, which likely reflects a family/home culture that implicitly or explicitly places high value on science and mathematics. Identifying factors that might facilitate positive attitude is important to increase the likelihood that students will choose, and be retained in, mathematics and STEM education and careers.
Article
Almost every state has in place a state assessment and accountability system. These systems vary greatly in their characteristics, but share a common global purpose of improving teaching and learning. Some of the variations in the state systems are discussed and illustrated with examples from selected states. Issues that are critical to the value and interpretation of results such as the use, if any, of comparisons among schools that serve students who come from different socioeconomic backgrounds, the relative weight given to current status or to improvement, and the basis for judging improvements at the school level (i.e., cross-sectional comparisons, quasi-longitudinal,
Article
The assessment of school effectiveness in educational research studies is considered from the viewpoint of statistical modelling. A variety of models are applied to a set of data on 907 pupils in 18 schools from one Local Education Authority. We argue for the general use of variance component or "random parameter" models for the analysis of such studies involving clustered observations. For the data examined, the model which regresses school mean outcome on school mean intake (i.e., the "means on means" model) is shown to give estimated school effects considerably different from those produced by other models. In the light of our results, we comment on several recent large-scale British studies of school effectiveness.
Article
The use of complex value-added models that attempt to isolate the contributions of teachers or schools to student development is increasing. Several variations on these models are being applied in the research literature, and policy makers have expressed interest in using these models for evaluating teachers and schools. In this article, we present a general multivariate, longitudinal mixed-model that incorporates the complex grouping structures inherent to longitudinal student data linked to teachers. We summarize the principal existing modeling approaches, show how these approaches are special cases of the proposed model, and discuss possible extensions to model more complex data structures. We present simulation and analytical results that clarify the interplay between estimated teacher effects and repeated outcomes on students over time. We also explore the potential impact of model misspecifications, including missing student covariates and assumptions about the accumulation of teacher effects over time, on key inferences made from the models. We conclude that mixed models that account for student correlation over time are reasonably robust to such misspecifications when all the schools in the sample serve similar student populations. However, student characteristics are likely to confound estimated teacher effects when schools serve distinctly different populations.
Article
The Tennessee Value-Added Assessment System (TVAAS) has been designed to use statistical mixed-model methodologies to conduct multivariate, longitudinal analyses of student achievement to make estimates of school, class size, teacher, and other effects. This study examined the relative magnitude of teacher effects on student achievement while simultaneously considering the influences of intraclassroom heterogeneity, student achievement level, and class size on academic growth. The results show that teacher effects are dominant factors affecting student academic gain and that the classroom context variables of heterogeneity among students and class sizes have relatively little influence on academic gain. Thus, a major conclusion is that teachers make a difference. Implications of the findings for teacher evaluation and future research are discussed.
Article
The increasing public demand to hold schools accountable for their effects on student outcomes lends urgency to the task of clarifying statistical issues pertaining to studies of school effects. This article considers the specification and estimation of school effects, the variability of effects across schools, and the proportion of variation in student outcomes attributable to differences in school context and practice. We present a statistical model that defines two different types of school effect: one appropriate for parents choosing schools for their children, the second for agencies evaluating school practice. Studies of both types of effect are viewed as quasi-experiments posing formidable obstacles to valid causal inference. A multilevel decomposition of variance within and between schools has important and perhaps counterintuitive implications for school evaluation. The potential for unbiased estimation depends on the type of effect under consideration because the two types of school effect have markedly different data requirements. Commonly used estimators of each effect are shown to be biased and, in some cases, inconsistent. Analyses of survey data from Scotland illustrate the recommended techniques. We conclude with a brief discussion of the role of school evaluation in a broader agenda of research in support of school improvement.
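The within-school and between-school decomposition of variance mentioned in this abstract can be sketched with a one-way ANOVA (method-of-moments) estimator for balanced data. This is a textbook simplification swapped in for illustration, not the multilevel model the article develops. The function names `variance_components` and `intraclass_correlation` and the toy scores are hypothetical.

```python
from statistics import fmean


def variance_components(groups):
    """Estimate (between, within) variance for balanced groups.

    `groups` is a list of equal-length lists of student outcomes,
    one inner list per school. Uses the one-way ANOVA estimator:
    between = (MSB - MSW) / n, truncated at zero.
    """
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    j = len(groups)
    group_means = [fmean(g) for g in groups]
    grand_mean = fmean(group_means)
    # Mean square between schools.
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (j - 1)
    # Mean square within schools (pooled).
    msw = sum((y - m) ** 2
              for g, m in zip(groups, group_means)
              for y in g) / (j * (n - 1))
    between = max((msb - msw) / n, 0.0)
    return between, msw


def intraclass_correlation(groups):
    """Share of total variance that lies between schools."""
    between, within = variance_components(groups)
    return between / (between + within)


if __name__ == "__main__":
    # Toy data: three hypothetical schools with three students each.
    schools = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(variance_components(schools))
    print(intraclass_correlation(schools))
```

The between-school share computed this way describes only how outcomes are distributed across schools. As the abstract stresses, it does not by itself identify either type of school effect.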
Popham, J. (1997). The moth and the flame: Student learning as a criterion of instructional competence. In J. Millman, (Ed.), Grading teachers, grading schools. Is student achievement a valid evaluation measure? Thousand Oaks, CA: Corwin.
Raudenbush, S. W., & Willms, J. D. (1995). The estimation of school effects. Journal of Educational and Behavioral Statistics, 20(4), 307–335.
Sanders, W. L., Saxton, A. M., & Horn, S. P. (1997). The Tennessee Value-Added Assessment System: A quantitative, outcomes-based approach to educational assessment. In J. Millman, (Ed.), Grading teachers, grading schools. Is student achievement a valid evaluation measure? (pp. 137–162). Thousand Oaks, CA: Corwin.
Sanders, W. L., & Rivers, J. C. (1996). Cumulative and residual effects of teachers on future student academic achievement. University of Tennessee Value-Added Research and Assessment Center.
Webster, W. J., & Mendro, R. L. (1997). The Dallas Value-Added Accountability System. In J. Millman, (Ed.), Grading teachers, grading schools. Is student achievement a valid evaluation measure? (pp. 81–99). Thousand Oaks, CA: Corwin.
Steinberg, J., & Henriques, D. (2001). When a test fails the schools, careers and reputations suffer. New York Times, May 21, 2001. Retrieved October 17, 2002, from http://query.nytimes.com/search/restricted/.
University of Florida. (2000a). Prototype analysis of school effects. Value-Added Research Consortium.
University of Florida. (2000b). Measuring gains in student achievement: A feasibility study. Value-Added Research Consortium.
WILLIAM SANDERS is Senior Research Fellow, Value-Added Research and Assessment, SAS Institute, Inc., 100 SAS Campus Drive, Cary, NC 27513-8617; Bill.Sanders@sas.com. His areas of specialization are statistical mixed linear models.
Ballou, D., & Wright, P. (2003). Measurement error in free and reduced-price lunch. Unpublished manuscript.
Darling-Hammond, L. (1997). Toward what end? The evaluation of student learning for the improvement of teaching. In J. Millman, (Ed.), Grading teachers, grading schools. Is student achievement a valid evaluation measure? (pp. 248-263). Thousand Oaks, CA: Corwin.