Rebecca Zwick’s research while affiliated with Educational Testing Service and other places

Publications (78)


Using an Index of Admission Obstacles with Constrained Optimization to Increase the Diversity of College Classes
  • Article

November 2020 · 30 Reads · 3 Citations

Educational Assessment

Rebecca Zwick · Andrew Blatter · Lei Ye

·

Today, postsecondary institutions in the US typically wish to enroll entering classes that are both academically qualified and diverse. Although the definition of diversity varies from school to school, the challenge is essentially the same: How can academic objectives be combined with goals that involve the composition of the entering class? Many schools have a commitment to facilitating access for under-represented minorities or low-income applicants, or for members of nearby communities. Incorporating these goals while maintaining academic standards can be challenging.



Fairness in Measurement and Selection: Statistical, Philosophical, and Public Perspectives

November 2019 · 77 Reads · 16 Citations

Educational Measurement Issues and Practice

Selection decisions have a major impact on our education, occupation, and quality of life, and the role of standardized tests in selection has always been a source of controversy. Here, I consider various definitions of fairness in measurement and selection—those emerging from within educational measurement and statistics, those from philosophy, and finally, those from the public. I use examples of public challenges to selection practices to illustrate the fact that technical and philosophical definitions of fairness do not align well with public concerns. I emphasize the importance of promoting awareness of existing standards, advocating for the fair use of testing and selection practices, and communicating in a candid and straightforward way when engaging with test takers and test users.


Using Constrained Optimization to Increase the Representation of Students from Low-Income Neighborhoods

October 2019 · 61 Reads · 7 Citations

Applied Measurement in Education

In US colleges, the scarcity of students from low-income families is a major concern. We present a novel way of boosting the percentage of qualified low-income students using constrained optimization (CO), an operations research technique. CO allows incorporation of both academic requirements and diversity goals in college admissions. The incoming class’s academic credentials are maximized while constraints on class composition are imposed. In particular, the percentage of students in a certain demographic group can be required to exceed a minimum. In an illustrative analysis, we show how CO can be used to increase the proportion of admitted students from low-income neighborhoods.
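The idea in this abstract (maximize academic credentials subject to a minimum share of one group) can be illustrated with a small sketch. This is not the authors' formulation: the `Applicant` fields, the toy data, and the `select_class` helper are all hypothetical, and because there is only one group constraint the sketch solves the problem exactly by enumerating the number of admitted group members rather than calling a general-purpose solver.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    name: str
    academic_index: float  # hypothetical composite score; higher is better
    low_income: bool       # hypothetical group flag used in the constraint


def select_class(applicants, class_size, min_group_share):
    """Pick class_size applicants maximizing total academic_index,
    subject to at least min_group_share of them being low_income."""
    by_score = lambda a: a.academic_index
    group = sorted((a for a in applicants if a.low_income), key=by_score, reverse=True)
    others = sorted((a for a in applicants if not a.low_income), key=by_score, reverse=True)

    # Smallest integer count that satisfies the share constraint.
    min_count = math.ceil(min_group_share * class_size - 1e-9)

    best = None
    # For a fixed number j of group members, the best class is the top j
    # group members plus the top (class_size - j) of everyone else.
    for j in range(min_count, min(class_size, len(group)) + 1):
        if class_size - j > len(others):
            continue  # not enough non-group applicants to fill the class
        chosen = group[:j] + others[:class_size - j]
        total = sum(a.academic_index for a in chosen)
        if best is None or total > best[0]:
            best = (total, chosen)

    if best is None:
        raise ValueError("constraints cannot be satisfied with this pool")
    return best[1]


# Toy applicant pool (invented values for illustration only).
applicants = [
    Applicant("A", 3.9, False),
    Applicant("B", 3.8, False),
    Applicant("C", 3.7, False),
    Applicant("D", 3.6, True),
    Applicant("E", 3.5, False),
    Applicant("F", 3.4, True),
    Applicant("G", 3.0, True),
]

unconstrained = select_class(applicants, class_size=4, min_group_share=0.0)
constrained = select_class(applicants, class_size=4, min_group_share=0.5)
print(sorted(a.name for a in unconstrained))
print(sorted(a.name for a in constrained))
```

With several simultaneous composition constraints, enumeration no longer suffices and an integer programming solver would be the more natural tool.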


Assessment in American Higher Education: The Role of Admissions Tests

May 2019 · 409 Reads · 39 Citations

The Annals of the American Academy of Political and Social Science

In this article, I review the role of college admissions tests in the United States and consider the fairness issues surrounding their use. The two main tests are the SAT, first administered in 1926, and the ACT, first given in 1959. Scores on these tests have been shown to contribute to the prediction of college performance, but their role in the admissions process varies widely across colleges. Although test scores are consistently listed as one of the most important admissions factors in national surveys of postsecondary institutions, an increasing number of schools have adopted “test-optional” policies. At these institutions, test score requirements are seen as a barrier to campus diversity because of the large performance gaps among ethnic and socioeconomic groups. Fortunately, the decentralized higher education system in the United States can accommodate a wide range of admissions policies. It is essential, however, that the impact of admissions policy changes be studied and that the resource implications of these changes be thoroughly considered.


Figure 4.1 Mock-Ups of displays used (Clockwise from top: Table, Line, Text, Bar, Projection).
1 Information categories of statements with examples
2 Number of questions of each type for each display.
Communicating Measurement Error Information to Teachers and Parents
  • Chapter

August 2018 · 51 Reads · 1 Citation

Aggregating Polytomous DIF Results Over Multiple Test Administrations

March 2018 · 23 Reads

Journal of Educational Measurement

In typical differential item functioning (DIF) assessments, an item's DIF status is not influenced by its status in previous test administrations. An item that has shown DIF at multiple administrations may be treated the same way as an item that has shown DIF in only the most recent administration. Therefore, much useful information about the item's functioning is ignored. In earlier work, we developed the Bayesian updating (BU) DIF procedure for dichotomous items and showed how it could be used to formally aggregate DIF results over administrations. More recently, we extended the BU method to the case of polytomously scored items. We conducted an extensive simulation study that included four “administrations” of a test. For the single-administration case, we compared the Bayesian approach to an existing polytomous-DIF procedure. For the multiple-administration case, we compared BU to two non-Bayesian methods of aggregating the polytomous-DIF results over administrations. We concluded that both the BU approach and a simple non-Bayesian method show promise as methods of aggregating polytomous DIF results over administrations.
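The notion of formally aggregating DIF evidence over administrations can be sketched as sequential Bayesian updating, where the posterior from one administration becomes the prior for the next. The abstract does not specify the model, so the sketch below assumes a simple normal prior with a normally distributed DIF estimate of known variance; the function names and all numbers are hypothetical and are not taken from the study.

```python
def update_normal(prior_mean, prior_var, estimate, estimate_var):
    """Posterior mean and variance for a normal prior combined with a
    normally distributed estimate of known variance (conjugate update)."""
    precision = 1.0 / prior_var + 1.0 / estimate_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + estimate / estimate_var)
    return post_mean, post_var


def aggregate_dif(estimates, prior_mean=0.0, prior_var=1.0):
    """Update the DIF parameter's distribution one administration at a time.

    estimates: list of (dif_estimate, sampling_variance) pairs, in order.
    Returns the (mean, variance) of the posterior after each administration.
    """
    mean, var = prior_mean, prior_var
    history = []
    for estimate, estimate_var in estimates:
        mean, var = update_normal(mean, var, estimate, estimate_var)
        history.append((mean, var))
    return history


# Four hypothetical administrations of the same item (invented values).
administrations = [(0.40, 0.09), (0.55, 0.12), (0.30, 0.10), (0.50, 0.08)]
for i, (mean, var) in enumerate(aggregate_dif(administrations), start=1):
    print(f"after administration {i}: mean={mean:.3f}, variance={var:.3f}")
```

Under these assumptions the posterior variance shrinks with every administration, which is how an item that shows DIF repeatedly accumulates stronger evidence than one flagged only once.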


Exploring the Effectiveness of a Measurement Error Tutorial in Helping Teachers Understand Score Report Results

July 2016 · 36 Reads · 32 Citations

Educational Assessment

The goal of this study was to explore the effectiveness of a short web-based tutorial in helping teachers to better understand the portrayal of measurement error in test score reports. As described below, the short video tutorial included both verbal and graphical representations of measurement error. Results showed a significant difference in comprehension scores between each of two tutorial groups (basic and enhanced) and the control group (no tutorial), but not between the two tutorial groups. Results also provided evidence of teachers' misconceptions about the meaning of measurement error and confidence bands.


Comparing Graphical and Verbal Representations of Measurement Error in Test Score Reports

May 2014 · 54 Reads · 32 Citations

Educational Assessment

Research has shown that many educators do not understand the terminology or displays used in test score reports and that measurement error is a particularly challenging concept. We investigated graphical and verbal methods of representing measurement error associated with individual student scores. We created four alternative score reports, each constituting an experimental condition, and randomly assigned them to research participants. We then compared comprehension and preferences across the four conditions. In our main study, we collected data from 148 teachers. For comparison, we studied 98 introductory psychology students. Although we did not detect statistically significant differences across conditions, we found that participants who reported greater comfort with statistics tended to have higher comprehension scores and tended to prefer more informative displays that included variable-width confidence bands for scores. Our data also yielded a wealth of information regarding existing misconceptions about measurement error and about score-reporting conventions.


AN INVESTIGATION OF THE EFFICACY OF CRITERION REFINEMENT PROCEDURES IN MANTEL-HAENSZEL DIF ANALYSIS

June 2013 · 18 Reads · 2 Citations

ETS Research Report Series

Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. Although it is often assumed that refinement of the matching criterion always provides more accurate DIF results, the actual situation proves to be more complex. To explore the effectiveness of refinement, we conducted a simulation study consisting of 40 conditions that varied in terms of amount and pattern of DIF, sample sizes, and ability distributions. We found that the effectiveness of refinement was heavily dependent on whether DIF was balanced (with positive DIF values compensating for negative DIF values) or unbalanced (all in one direction). In balanced conditions, the unrefined method generally produced better results, whereas, in unbalanced conditions, the opposite was true. In the absence of information about the pattern of DIF, it is probably best to choose the refined method because it is only slightly disadvantageous in balanced conditions, whereas the unrefined method can be substantially disadvantageous in certain unbalanced conditions.


Citations (69)


... Constrained Optimization. Constrained optimization (CO) is a mathematical procedure which allows to build models that ensure HEIs reach diversity goals while upholding excellence (Zwick, 2020). CO originated from operations research: an analytical technique of decision-making. ...

Reference:

Admissions to Graduate Studies: Selection Methods for Life and Natural Sciences Masters’ Programs at a European Research University. Doctoral dissertation.
Using Mathematical Models to Improve Access to Postsecondary Education
  • Citing Chapter
  • January 2020

... Predictive validity usually refers to the relation between test scores and relevant outcomes, such as academic performance (Berry, 2015). What constitutes fairness in selective admissions is contested, but there is consensus that absence of differential prediction is a key requirement (Zwick, 2019). Differential prediction refers to differences between subgroups in regression equations predicting outcomes from admission test scores, which is indicated by differences in regression slopes and/or intercepts (Berry, 2015). ...

Fairness in Measurement and Selection: Statistical, Philosophical, and Public Perspectives
  • Citing Article
  • November 2019

Educational Measurement Issues and Practice

... Selective admissions decisions are challenging by nature, as they profoundly impact an individual's education, careers, and quality of life (Zwick, 2019). Consequently, evidence-based student selection decision making for objective, transparent, and fair selective admissions has been a prominent topic of interest for several decades. ...

Using Constrained Optimization to Increase the Representation of Students from Low-Income Neighborhoods
  • Citing Article
  • October 2019

Applied Measurement in Education

... Unlike entrance tests, screening tests are not necessarily tied to the admission process of a particular program but are often used in broader contexts, such as identifying talent or determining eligibility for specialized training programs. In past studies involving the current five years, there are several studies that are of concern to researchers in the screening test or placement test of students mostly from the country from the United States of America (Bloem et al., 2021; Buzzetto-more & Alade, 2019; Ngo et al., 2021; Ockey et al., 2020; Zwick, 2019), United Kingdom (Silva et al., 2020), Korea (Van BAO & Cho, 2022) and China (Wei, 2020). These studies focus on placement and screening tests for admission to higher education institutions. ...

Assessment in American Higher Education: The Role of Admissions Tests
  • Citing Article
  • May 2019

The Annals of the American Academy of Political and Social Science

... It has been more than 4 years since the publication of Score Reporting Research and Applications (Zapata-Rivera, 2018). This book, part of the National Council for Measurement in Education (NCME) book series, includes work in areas such as validity in score reporting (Tannenbaum, 2018), cognitive affordances of graphical representations (Hegarty, 2018), evaluation of subscores (Sinharay, Puhan, Haberman, & Hambleton, 2018), communicating measurement error information to teachers and parents (Zapata-Rivera, Kannan, & Zwick, 2018), score reporting issues for licensure, certification, and admissions programs (O'Donnell & Sireci, 2018), communicating growth (Zenisky, Keller, and Park, 2018), score reports for large-scale testing programs (Slater, Livingston, & Silver, 2018), and evaluating the use of interactive reports and dashboards in formative contexts (Brown, O'Leary, & Hattie, 2018; Feng, Krumm, & Grover, 2018; Corrin, 2018). ...

Communicating Measurement Error Information to Teachers and Parents

... First, the DRT analysis approach used in this study is just one of the many techniques used to compare response times between groups. For example, we used a multiple regression approach, while Ercikan et al. (2020) used the standardized mean difference (SMD, Zwick & Thayer, 1996) approach. In this study, we used a significance level of 0.05 as the cutoff for βs in the regression analysis to indicate the presence of DRT. ...

Evaluating the Magnitude of Differential Item Functioning in Polytomous Items
  • Citing Article
  • November 1997

Journal of Educational and Behavioral Statistics

... Shreiner and Dykes (2021) concluded in their study that the majority of social studies teachers who participated in their study did not include data literacy practices in their lessons. Some studies also point out that teachers have difficulties in using and interpreting data (Cowie & Cooper, 2016; Gelderblom et al., 2016; Zapata-Rivera et al., 2016). Teachers are expected to use data for teaching purposes in their decision-making processes and when designing their lessons. ...

Exploring the Effectiveness of a Measurement Error Tutorial in Helping Teachers Understand Score Report Results
  • Citing Article
  • July 2016

Educational Assessment

... From the information presented in the table above, we can highlight the following results: 1) the parallel analysis of the nine scales indicates that a single dimension underlies each of them; 2) the analysis of the factor loadings for each scale shows that in all cases these are also very robust dimensions with very high factor loadings; 3) the analysis of the percentages of variance explained by the latent factor captured for each scale also indicates that the scales are essentially unidimensional according to the criteria of Stout (1987) and Zwick (1985) (specifically, the dimension identified in eight of the nine scales has percentages of variance above 40%, a percentage that is slightly lower for the conflict scale, at 36.52%); and 4) for all scales, a good fit of the data to the single-dimension model is observed. That is, for all scales the RMSEA indices are below .08 and the CFI and GFI indices are above .95 ...

ASSESSMENT OF THE DIMENSIONALITY OF NAEP YEAR 15 READING DATA
  • Citing Article
  • June 1986

ETS Research Report Series

... The sequential structure of the Economics Ph.D. program implies that for all students, the steps appear in the same distinct order-Theory Comprehensive exam, Field Exam, and Ehrenberg and Mavros (1995) argue that GRE scores poorly measure student quality, which accounts for the lack of association with degree completion in their study. The absence of a statistically significant relationship between GRE score and degree completion is also reported by Zwick (1991), Zwick and Braun (1988), and Dawes (1975). On the other hand, Attiyeh and Attiyeh (1997) and Krueger and Wu (2000) show that GRE scores, especially from the quantitative section, strongly predict admission to economics doctoral programs. ...

METHODS FOR ANALYZING THE ATTAINMENT OF GRADUATE SCHOOL MILESTONES: A CASE STUDY
  • Citing Article
  • June 1988

ETS Research Report Series

... Here we discuss an idealized type of data, following the Guttman scaling [35,36]. In the homogeneous case [37] (see below), many exact results are known for the corresponding Pearson correlation matrix $R_{hG}$ [37,38]. As we will show, this allows to write an analytical expression for the von Neumann entropy $S_{hG}$ of $R_{hG}$. ...

SOME PROPERTIES OF THE PEARSON CORRELATION MATRIX OF GUTTMAN-SCALABLE ITEMS
  • Citing Article
  • June 1986

ETS Research Report Series