Albert E. Beaton’s research while affiliated with Boston College and other places


Publications (16)


Large-Scale Group-Score Assessment
  • Chapter
  • Full-text available

October 2017 · 489 Reads · 7 Citations

Albert E. Beaton · John L. Barone

Large-scale group-score assessments are widely used to inform educational policymakers about the needs and accomplishments of various populations and subpopulations. The purpose of this chapter is to chronicle ETS’s technical contributions in this area. The focus is on a subset of important educational achievement testing programs, ones that assess large populations (e.g., the United States as a whole, an individual state), use population-defining variables (e.g., racial/ethnic, gender), and include consideration of other policy-relevant factors (e.g., the number of hours watching TV, number of mathematics courses taken).


Overview of the Scaling Methodology Used in the National Assessment

September 2005 · 26 Reads · 27 Citations

Journal of Educational Measurement

The National Assessment of Educational Progress (NAEP) uses item response theory (IRT)–based scaling methods to summarize the information in complex data sets. Scale scores are presented as tools for illuminating patterns in the data and for exploiting regularities across patterns of responses to tasks requiring similar skills. In this way, the dominant features of the data are captured. Discussed are the necessity of global scores or more detailed subscores, the creation of developmental scales spanning different age levels, and the use of scale anchoring as a way of interpreting the scales.



Comparing cross-national student performance on TIMSS using different test items

September 1998 · 8 Reads · 13 Citations

International Journal of Educational Research

The question addressed in this chapter is how “fair” the TIMSS tests were to the various participating countries. The Test-Curriculum Matching Analysis (TCMA) method was used to investigate how results might have changed if different subsets of TIMSS items were considered. The method computes the average proportion correct for each country on each selection of appropriate items. The result of the TCMA is a square matrix, with the rows representing the various results for each country and the columns representing the different item sets deemed appropriate for each country. The results suggested that the relative positions of the countries changed very little as a result of the item selection.
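The matrix described in this abstract can be illustrated with a minimal sketch. It assumes that per-item proportions correct are already available for each country and that each country has nominated a set of appropriate items. The function name `tcma_matrix`, the country labels, the item identifiers, and all numbers are hypothetical and are not taken from TIMSS data or from the chapter itself.

```python
from statistics import mean


def tcma_matrix(prop_correct, item_sets):
    """Build a TCMA-style square matrix.

    prop_correct: {country: {item_id: proportion correct}}
    item_sets:    {country: [item_ids that country deemed appropriate]}
    Returns {row_country: {selecting_country: average proportion correct}}.
    """
    matrix = {}
    for country, scores in prop_correct.items():
        # One row per country: its average on every country's item selection.
        matrix[country] = {
            selector: mean(scores[item] for item in items)
            for selector, items in item_sets.items()
        }
    return matrix


# Made-up data for two hypothetical countries and three hypothetical items.
prop_correct = {
    "A": {"q1": 0.8, "q2": 0.6, "q3": 0.4},
    "B": {"q1": 0.5, "q2": 0.7, "q3": 0.9},
}
item_sets = {
    "A": ["q1", "q2"],  # items country A considers appropriate
    "B": ["q2", "q3"],  # items country B considers appropriate
}

matrix = tcma_matrix(prop_correct, item_sets)
for country, row in matrix.items():
    print(country, {selector: round(value, 2) for selector, value in row.items()})
```

Reading down a column shows how all countries perform on one country's chosen items, and the diagonal shows each country on its own selection. If rankings within each column are similar, item selection has little effect on relative standing, which is the pattern the abstract reports.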




Mathematics and Science Achievement in the Final Year of Secondary School: IEA's Third International Mathematics and Science Study (TIMSS).

January 1998 · 338 Reads · 281 Citations

The Third International Mathematics and Science Study (TIMSS) covered five different grade levels, with more than 40 countries collecting data in more than 30 different languages. More than a million students were tested. The present report contains the TIMSS results for students in the final year of secondary school. Mathematics and science literacy achievement results are reported for 21 countries; advanced mathematics results and physics results, respectively, are reported for 16 countries. These results complete the first round of descriptive reports from the TIMSS study. Together with the results for primary school students (third and fourth grade in most countries) and middle school students (seventh and eighth grades in most countries), the results contained in this report provide valuable information about the relative effectiveness of a country's education system as students progress through school. A ten-page Executive Summary details the extensive conclusions to be drawn from the study. Dozens of tables and figures provide detailed statistics for all participating countries. The Netherlands and Sweden were the top performing countries in mathematics; France was the top performer in advanced mathematics; Norway and Sweden had physics achievement levels significantly higher than other participating countries. The appendixes contain extensive information pertaining to the development of the TIMSS tests, sample sizes and participation rates, compliance with sampling guidelines, and the test-curriculum matching analysis. (DDR)



Providing Data for Educational Policy in an International Context: The Third International Mathematics and Science Study (TIMSS)

January 1997 · 23 Reads · 16 Citations

European Journal of Psychological Assessment

Policy-makers in many nations are involved in educational reforms. To make effective educational decisions for the 21st century, policy-makers need a wide variety of information, such as comparative performance data and curriculum information from other nations. International surveys provide a broad base of information and allow countries to view their current status and planning within an international perspective. This paper describes the goals of the Third International Mathematics and Science Study (TIMSS). This study is the largest international comparative study of student achievement in mathematics and science ever attempted. Forty-five countries are participating, and the focus is on 9-yr-olds, 13-yr-olds, and students in their final year of secondary education. TIMSS was designed to obtain data that are meaningful in an international context and that will contribute to the formation of educational policy and reform in mathematics and science education in the participating countries. (PsycINFO Database Record (c) 2012 APA, all rights reserved)



Mathematics Achievement in the Middle School Years: IEA's Third International Mathematics and Science Study (TIMSS)

January 1996 · 3,791 Reads · 781 Citations

A recently completed landmark study of mathematics and science education in more than 40 countries gathered information that can help address questions as to why students in one country do better in math and science than students in another. This report focuses on the results of the primary school science test of students in 26 countries, from the Third International Mathematics and Science Study (TIMSS). Details of how the study was conducted, the nature of the science test, country characteristics, differences in student achievement, student achievement by science content area, and an analysis of example problems are included. Ideas of intended and implemented curricula are discussed and a number of questions related to these ideas that TIMSS may answer are listed. (DDR)


Citations (15)


... Large-scale assessments, on the other hand, aim to accurately report on what groups of participants know and can do. Because of this, large-scale assessments are at times referred to as group-score assessments (Beaton & Barone, 2017; Mazzeo et al., 2006) to explicitly distinguish them from individual-score assessments. Participants in a large-scale assessment do not receive individual score reports, and no action that specifically affects them is expected to be taken based on their performance on the assessment. ...

Reference:

Considerations for the use of plausible values in large-scale assessments
Large-Scale Group-Score Assessment

... The outstanding levels of performance among East Asian countries became even more apparent in TIMSS 1995, when the four top performers were all from the region: Singapore, Korea, Japan and Hong Kong, both in eighth-grade (out of thirty-nine participating countries) and fourth-grade mathematics (out of twenty-five countries) (Harmon et al. 1997). In TIMSS 1999 the best-performing countries in mathematics were Singapore, Korea, Taiwan, Hong Kong, and Japan (out of thirty-eight participants) (Martin et al., 2000; Mullis et al., 2000). ...

Performance Assessment in IEA's Third International Mathematics and Science Study
  • Citing Book
  • January 1997

... The higher competition and rapid progress in a globally changing economic and technological environment have been one of the driving forces for enhancing educational accountability in many countries (Martin et al., 1998). Thus it is undeniably essential for a nation to improve its standards of teaching, research and practice in science, mathematics, technology and engineering. ...

Science achievement in Missouri and Oregon in an international context: 1997 TIMSS benchmarking
  • Citing Article
  • January 1998

... Part of the goal for using performance standards is that they help us clarify the level of mastery needed for college-level courses, for a degree, or for a profession (Fields, 2011). Like all standards, they too have been met with sharp criticism, particularly when they are used to compare and enhance large-scale measurement; as such, some advocate for the use of achievement levels via the achievement level description method (Beaton, Linn & Bohrnstedt, 2012). However, they have also been used to define the level at which a student must perform at a given grade level, rather than more open-endedly describe a progression of mastery. ...

Alternative Approaches to Setting Performance Standards for the National Assessment of Educational Progress (NAEP)

... To adequately cover the broad range of contents, a large pool of assessment items is necessary. To limit the assessment time, costs and minimize test fatigue, many LSAs utilize matrix item sampling (Beaton and Zwick, 1992; Mislevy et al., 1992) and each sampled student is administered only a small fraction of items from the item pool. The subset of items each student receives may differ in terms of properties and content (e.g., content domain distribution, item difficulty, and reliability) which result in missingness by design. ...

Chapter 1: Overview of the National Assessment of Educational Progress
  • Citing Article
  • June 1992

Journal of Educational and Behavioral Statistics

... After estimating the parameters, short scale construction was performed. This work adopted the methodology proposed by Beaton and Allen (1992), which seeks, for each level of the scale, to identify items whose discriminatory power is around that level and to use these items to describe what respondents whose skills are close to that level know and can do. For the selection of anchor items, the criteria proposed by Beaton and Allen (1992) were adopted, with anchor items formally defined as follows: ...

Chapter 6: Interpreting Scales Through Scale Anchoring
  • Citing Article
  • June 1992

Journal of Educational and Behavioral Statistics

... Another argument suggests that the instruments of evaluation and assessment that provide the evidence policymakers use are themselves unreliable (Beaton, 1998; Bracey, 2004; Ercikan, 1998; Foy, 2005). Reliability of instruments is an issue from the local level all the way up to the global level. ...

Comparing cross-national student performance on TIMSS using different test items
  • Citing Article
  • September 1998

International Journal of Educational Research

... Hence, with respect to instruction, the results could be used to create (manageable) barriers in teaching-learning processes for promoting competencies systematically (see Hartig et al., 2012; Klotz et al., 2015; Nickolaus, 2016). Therefore, Hartig's method is preferred in VET (e.g., Behrendt et al., 2017; Klotz et al., 2015; Schumann & Eberle, 2011) in contrast to post hoc task analyses (Beaton & Allen, 1992). ...

Interpreting Scales Through Scale Anchoring
  • Citing Article
  • July 1992

Journal of Educational Statistics

... There is great wisdom in the classic maxim on this point, "If you want to measure change, don't change the measure" [26], but changes in people's lives will necessitate changes in measurement; surveys like the HRS will need to use techniques along the lines of those deployed here to help calibrate the impacts of such changes on measures of key quantities. ...

The Effect of Changes in the National Assessment: Disentangling the NAEP 1985-86 Reading Anomaly. Revised
  • Citing Article
  • January 1990

... However, the results of TIMSS, and ILSAs in general, can be used by policymakers in several ways, ranging from justifying decisions in order to strengthen educational quality, to ensuring equity and understanding what happens in education from an accountability perspective (Wiseman, 2010). This policy-informing aim of TIMSS, and ILSAs in general, is nurtured by the studies' intention of providing insight into the outcomes of education (Beaton, Martin, & Mullis, 1997; OECD, 1999; Tobin, Nugroho, & Lietz, 2016). This is particularly true when it comes to quantitative data generated by ILSAs that try to reveal 'what works' in education (Fischman, Topper, Silova, Goebel, & Holloway, 2019; Phillips & Ochs, 2003; Wiseman, 2010). ...

Providing Data for Educational Policy in an International Context: The Third International Mathematics and Science Study (TIMSS)

European Journal of Psychological Assessment