Geofferey N. Masters’s research while affiliated with University of Melbourne and other places

Publications (12)


Educational measurement: Prospects for research and innovation
  • Article

December 1988 · 1 Read · 2 Citations

The Australian Educational Researcher

Geofferey N. Masters

Reforming the Assessment of Student Achievement in the Senior Secondary School

November 1988 · 10 Reads · 10 Citations

Australian Journal of Education

The challenge that confronts agencies responsible for assessment and reporting in the senior secondary school is to extend systematic assessment procedures to a broader range of learning outcomes than those currently assessed by public examinations, to develop methods of reporting which are more descriptive of individual achievement and which provide a better basis for describing and maintaining standards, and to provide results which are sufficiently comparable across schools to enable fair comparisons of applicants for tertiary study. Some recent developments in assessment and reporting practice are considered with a view to identifying methods and approaches capable of satisfying this diverse set of demands. An approach which is particularly appealing because of its potential to provide simultaneously more descriptive reports of student achievement and adequate levels of comparability is the use of a set of common assessment tasks attempted by all students enrolled in each Year 12 course of study.


The Analysis of Partial Credit Scoring

October 1988 · 311 Reads · 86 Citations

Applied Measurement in Education

This article discusses a range of issues in the practical application of an item response theory (IRT) method for partial credit scoring. After a brief discussion of partial credit scoring as an alternative to right-wrong scoring in the measurement of educational achievement, an IRT model for partial credit analysis is developed and described. This model is presented as a straightforward and logical application of Rasch's dichotomous model to a sequence of ordered response alternatives. The distinctive nature of the item parameters in the model is described and these parameters are contrasted with two more familiar sets of parameters: Thurstone thresholds and the difficulties of dichotomously scored subitems. Issues in marking out and interpreting variables using this model are discussed. Brief mention is made of several special cases of the partial credit model that may be useful in particular applications and for particular kinds of test and questionnaire data.
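
The abstract describes the model as an application of Rasch's dichotomous model to a sequence of ordered response alternatives. As a rough illustration only, the sketch below computes category probabilities under the usual textbook form of the partial credit model, in which the probability of scoring in category x is proportional to exp of the sum of (theta - delta_k) over the first x steps. The function name `pcm_probabilities` and the argument names are hypothetical and do not come from the article.

```python
import math


def pcm_probabilities(theta, deltas):
    """Return probabilities of scores 0..m for one item (illustrative sketch).

    theta  -- person ability (assumed name)
    deltas -- step parameters delta_1..delta_m for the item (assumed name)
    """
    # Cumulative sums of (theta - delta_k); score 0 corresponds to the empty sum, 0.
    cumulative = [0.0]
    for delta in deltas:
        cumulative.append(cumulative[-1] + (theta - delta))

    # Subtract the largest exponent before exponentiating, for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# With a single step, the sketch reduces to Rasch's dichotomous model.
probabilities = pcm_probabilities(theta=1.0, deltas=[0.0])
print(probabilities)
```

With one step parameter the second probability equals exp(theta - delta) / (1 + exp(theta - delta)), which is the dichotomous Rasch form the abstract refers to.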


Anchor Tests, Score Equating and Sex Bias

April 1988 · 4 Reads · 2 Citations

Australian Journal of Education

This paper discusses the use of anchor tests (scaling tests) to bring two or more sets of scores to a common scale. Particular attention is given to the rescaling of school-based assessments against an external test or examination and to potential sources of bias in this procedure. The need for routine validity checks is emphasized, and a latent trait approach to constructing a statistical framework for tests and examination score equating is described and illustrated. Bias caused by rescaling school assessments against an inappropriate anchor test is illustrated using a 1984 attempt to rescale students' assessments in English against the Australian Scholastic Aptitude Test.


Item Discrimination: When More Is Worse

March 1988 · 57 Reads · 116 Citations

Journal of Educational Measurement

High item discrimination can be a symptom of a special kind of measurement disturbance introduced by an item that gives persons of high ability a special advantage over and above their higher abilities. This type of disturbance, which can be interpreted as a form of item “bias,” can be encouraged by methods that routinely interpret highly discriminating items as the “best” items on a test and may be compounded by procedures that weight items by their discrimination. The type of measurement disturbance described and illustrated in this paper occurs when an item is sensitive to individual differences on a second, undesired dimension that is positively correlated with the variable intended to be measured. Possible secondary influences of this type include opportunity to learn, opportunity to answer, and test wiseness.


Measurement Models for Ordered Response Categories

January 1988 · 11 Reads · 40 Citations

Quantitative educational research depends on the availability of carefully constructed variables. The construction and use of a variable begin with the idea of a single dimension or line on which students can be compared and along which progress can be monitored. This idea is operationalized by inventing items intended as indicators of this latent variable and using these items to elicit observations from which students’ positions on the variable might be inferred.


Banking Non-Dichotomously Scored Items

December 1986 · 39 Reads · 18 Citations

Applied Psychological Measurement

A method for constructing a bank of items scored in two or more ordered response categories is described and illustrated. This method enables multistep problems, rating scale items, question "clusters," and other items using partial credit scoring to be calibrated and incorporated into an item bank, and it provides a mechanism for computer adaptive testing with items of this type. Procedures are described for calibrating an initial set of items, for testing the fit of items to the underlying measurement model, and for linking new items to an existing item bank. The method is illustrated using items from the Watson-Glaser Critical Thinking Appraisal.


A Comparison of Latent Trait and Latent Class Analyses of Likert-Type Data

March 1985 · 39 Reads · 44 Citations

Psychometrika



Common-Person Equating with the Rasch Model

March 1985 · 40 Reads · 35 Citations

Applied Psychological Measurement

Two procedures, one based on item difficulties, the other based on person abilities, were used to equate 14 forms of a reading comprehension test using the Rasch model. These forms had no items in common. For practical purposes, the two procedures produced equivalent results. An advantage of common-person equating for testing the unidimensionality assumption is pointed out, and the need for caution in interpreting tests of common-item invariance is stressed.
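
The abstract does not spell out the computation, so the following is only a minimal sketch of one common way to do common-person equating with the Rasch model: the same persons are measured on two separately calibrated forms, and the difference between their mean ability estimates is taken as the constant that shifts one form's scale onto the other's. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def common_person_shift(abilities_form_a, abilities_form_b):
    """Estimate the constant that moves form B measures onto the form A scale.

    Both lists hold ability estimates for the SAME persons, in the same order,
    each obtained from a separate calibration of one form (assumed setup).
    """
    if not abilities_form_a or len(abilities_form_a) != len(abilities_form_b):
        raise ValueError("need one estimate per person on each form")
    return mean(abilities_form_a) - mean(abilities_form_b)


def to_form_a_scale(measures_form_b, shift):
    """Apply the shift to form B measures (person abilities or item difficulties)."""
    return [measure + shift for measure in measures_form_b]


# Hypothetical estimates for three persons who took both forms.
shift = common_person_shift([0.5, 1.0, 1.5], [0.0, 0.5, 1.0])
print(shift, to_form_a_scale([-1.0, 0.0], shift))
```

Because the Rasch model depends only on the difference between ability and difficulty, the same constant can be added to form B item difficulties as well as to person abilities.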


The Essential Process in a Family of Measurement Models

December 1984 · 9 Reads · 179 Citations

Psychometrika



Citations (10)


... One function of these models is to examine behavior on measures intended to capture hierarchies of difficulty, which makes them highly suitable for developmental applications. The Rasch model tests the assumption that performances and items (or levels of items) form an invariant, hierarchical sequence (within probabilistically determined constraints) that can be successfully modeled along a single continuum (Andrich, 1989; Fisher, 1994; Masters, 1988; Wright & Linacre, 1989). In their raw form, little can be said about the relative distances between stage scores (Duncan, 1984; Michell, 1990; Thurstone, 1959; van der Linden, 1994). ...

Reference:

The Good Life A Longitudinal Study of Adult Value Reasoning
Measurement Models for Ordered Response Categories
  • Citing Chapter
  • January 1988

... Performance assessment (PA) as a "situated" mode of assessment suggests that the test items should require some kind of active demonstration of the knowledge in question rather than a prepositional account of it (Moss, 1992). Some actions are required in a realistic setting, involving enactment of a skill or problem solving (Masters & Hill, 1988; Wiggins, 1989). Actual performance will be observed as verbal, non-verbal, fine motor skills, gross motor skills, or a combination of these. ...

Reforming the Assessment of Student Achievement in the Senior Secondary School
  • Citing Article
  • November 1988

Australian Journal of Education

... Modern approaches to assessing students' academic achievements are based on the use of classical testing theory and Item Response Theory (IRT). The mathematical background of pedagogical measurement theory was created by Andersen [2,3], Andrich [5], Avanesov [7], Birnbaum [9], Guttman [18], Linacre [23], Lord et al. [25], Maslak et al. [28], Masters [29], Rasch [31] and other scientists. In IRT, the concept of a latent variable is used. ...

Educational measurement: Prospects for research and innovation
  • Citing Article
  • December 1988

The Australian Educational Researcher

... where it has been taken into account that ... and ... This general expression (2) is known as the Partial Credit Model (PCM) (Masters, 1982, 1988b, 1988c; Masters & Hyde, 1984; Masters & Wright, 1984). The condition that the Rating Scale Model imposes on the parameters τ_m is that they remain constant across items, and they are assumed to depend only on the response alternatives offered. Hence, the only difference between items is their location δ_i on the variable. ...

Measuring Attitude to School with a Latent Trait Model
  • Citing Article
  • January 1984

Applied Psychological Measurement

... The item information functions for all divide-by-total models for ordered categories are calculated according to Masters and Evans (1986;[38], p. 362, Equation (3)), those of the GRM according to Samejima (1968;[18], p. 60, Equation (6-6)), and those of the NRM according to Bock (1970;[27], p. 44, Equations (24) and (25)). The category information functions are obtained according to Muraki (1993;[39], p. 354, Equation (13)). ...

Banking Non-Dichotomously Scored Items
  • Citing Article
  • December 1986

Applied Psychological Measurement

... This equating study uses concurrent calibration with the partial credit model (PCM), a Rasch-type model for polytomous response options (Masters, 1982, 1985). In the following, for simplicity, we will refer to both the Rasch model for dichotomous responses (Rasch, 1960) and extension of the Rasch model for polytomous responses (Masters, 1982) as ''Rasch models.'' ...

Common-Person Equating with the Rasch Model
  • Citing Article
  • March 1985

Applied Psychological Measurement

... The performance data were first scaled using a partial credit model (Masters, 1982; 1988). To determine on this basis whether a multidimensional structure of interprofessional cooperation competence can be represented empirically (FF2), the deviances of the five-dimensional model (knowledge of one's own nursing role, knowledge of the role of cooperation partners working in the health professions, latent role distance, role taking, and role coordination with respect to the person receiving care) and of the two-dimensional model (knowledge of professional roles and role-related abilities) were compared with a one-dimensional model. ...

The Analysis of Partial Credit Scoring
  • Citing Article
  • October 1988

Applied Measurement in Education

... A questionnaire survey was conducted in a public housing estate to obtain the residents' subjective comfort feeling for the living room luminous environment. Participants chose the comfort level based on the Likert 5-point scale [40]. The type of the buildings is selected as Harmony 1 with the building plan shown in Fig. 1a. ...

A Comparison of Latent Trait and Latent Class Analyses of Likert-Type Data
  • Citing Article
  • March 1985

Psychometrika

... Many-faceted measurement models (Fischer, 1973; Linacre, 1989/1994) expand on probabilistic models for measurements emerging from multiple independent sources over the course of the twentieth century. The family of measurement models (Masters & Wright, 1984; Wright & Mok, 2000) routinely falling under the heading of "Rasch measurement theory" (Rasch, 1960/1980) are conceptually identical with (a) models for paired comparisons previously developed by Zermelo (Linacre, 2000a) and by Bradley-Terry (1952) and Luce (1959) (Andrich, 1988, p. 43; Linacre, 1995), and (b) additive conjoint (Luce & Tukey, 1964) formulations of fundamental measurement (Newby et al., 2009; ...). Perhaps most remarkably, C. S. Peirce notably put all the pieces together for a log-odds measurement model in 1878 (Linacre, 2000b). ...

The Essential Process in a Family of Measurement Models
  • Citing Article
  • December 1984

Psychometrika