Journal of applied measurement (J Appl Meas)

Description

Journal of Applied Measurement publishes refereed scholarly work from all academic disciplines that relates to measurement theory and its application to developing variables. The construction and interpretation of meaningful and unambiguous variables is a salient feature of measurement. It represents the congruence of measurement theory and substantive research in a wide range of scientific endeavors. The development of variables that map persons and items onto a common metric, operationally defined by the items and invariant across samples of persons and items, is a cornerstone of developing an understanding of the phenomena being measured and of constructing and verifying hypotheses based on these phenomena. The journal will also publish invited articles that provide examples of methodological issues that are relevant to constructing useful variables.

  • Website
    Journal of Applied Measurement website
  • Other titles
    Journal of applied measurement
  • ISSN
    1529-7713
  • OCLC
    43888528
  • Material type
    Periodical
  • Document type
    Journal / Magazine / Newspaper

Publications in this journal

  • Article: A bootstrap approach to evaluating person and item fit to the Rasch model.
    ABSTRACT: Historically, rule-of-thumb critical values have been employed for interpreting fit statistics that depict anomalous person and item response patterns in applications of the Rasch model. Unfortunately, prior research has shown that these values are not appropriate in many contexts. This article introduces a bootstrap procedure for identifying reasonable critical values for Rasch fit statistics and compares the results of that procedure to applications of rule-of-thumb critical values for three example datasets. The results indicate that rule-of-thumb values may over- or under-identify the number of misfitting items or persons.
    Journal of applied measurement 01/2013; 14(1):1-9.
  • Article: Using the dichotomous Rasch model to analyze polytomous items.
    ABSTRACT: One of the most important applications of the Rasch measurement models in educational assessment is the equating of tests. An important feature of attainment tests is the use of both dichotomous and polytomous items. The partial credit model (PCM) developed by Masters (1982) represents an extension of the dichotomous Rasch model for analysing polytomous item data. The dichotomous Rasch model has been used primarily to analyse dichotomous item data. Whilst the partial credit model can provide detailed information on the performance of individual score categories of polytomous items, it is mathematically more complex to use than the dichotomous Rasch model and can, under certain circumstances, present difficulties in interpreting item measures and in practical applications. This study explores the potential of using the dichotomous Rasch model to analyse polytomous items and equate tests. Results obtained from a simulation study and from analysing the data of a science achievement test indicate that the partial credit model and the dichotomous Rasch model produce similar item and person measures and equivalent cut scores on different test forms.
    Journal of applied measurement 01/2013; 14(1):44-56.
  • Article: Rasch modeling to assess Albanian and South African learners' preferences for real-life situations to be used in mathematics: a pilot study.
    ABSTRACT: This paper reports on an investigation on the real-life situations students in grades 8 and 9 in South Africa and Albania prefer to use in Mathematics. The functioning of the instrument used to assess the order of preference learners from both countries have for contextual situations is assessed using Rasch modeling techniques. For both the cohorts, the data fit the Rasch model. The differential item functioning (DIF) analysis rendered 3 items operating differentially for the two cohorts. Explanations for these differences are provided in terms of differences in experiences learners in the two countries have related to some of the contextual situations. Implications for interpretation of international comparative tests are offered, as are the possibilities for the cross-country development of curriculum materials related to contexts that learners prefer to use in Mathematics.
    Journal of applied measurement 01/2013; 14(1):91-105.
  • Article: With hiccups and bumps: the development of a Rasch-based instrument to measure elementary students' understanding of the nature of science.
    ABSTRACT: This research describes the development process, psychometric analyses and part validation study of a theoretically-grounded Rasch-based instrument, the Nature of Science Instrument-Elementary (NOSI-E). The NOSI-E was designed to measure elementary students' understanding of the Nature of Science (NOS). Evidence is provided for three of the six validity aspects (content, substantive and generalizability) needed to support the construct validity of the NOSI-E. A future article will examine the structural and external validity aspects. Rasch modeling proved especially productive in scale improvement efforts. The instrument, designed for large-scale assessment use, is conceptualized using five construct domains. Data from 741 elementary students were used to pilot the Rasch scale, with continuous improvements made over three successive administrations. The psychometric properties of the NOSI-E instrument are consistent with the basic assumptions of Rasch measurement, namely that the items are well-fitting and invariant. Items from each of the five domains (Empirical, Theory-Laden, Certainty, Inventive, and Socially and Culturally Embedded) are spread along the scale's continuum and appear to overlap well. Most importantly, the scale seems appropriately calibrated and responsive for elementary school-aged children, the target age group. As a result, the NOSI-E should prove beneficial for science education research. As the United States' science education reform efforts move toward students' learning science through engaging in authentic scientific practices (NRC, 2011), it will be important to assess whether this new approach to teaching science is effective. The NOSI-E can be used as one measure of whether this reform effort has an impact.
    Journal of applied measurement 01/2013; 14(1):57-78.
  • Article: Using multidimensional Rasch to enhance measurement precision: initial results from simulation and empirical studies.
    ABSTRACT: This study aimed to explore the effect on measurement precision of multidimensional, as compared with unidimensional, Rasch measurement for constructing measures from multidimensional Likert-type scales. Many educational and psychological tests are multidimensional, but common practice is to ignore correlations among the latent traits in these multidimensional scales in the measurement process. These practices may have serious validity and reliability implications. This study made use of both empirical data from 208,083 students and simulated data generated from 24 systematic combinations, each replicated 1000 times, of three conditions (sample size, degree of dimensionality, and scale length) to compare unidimensional and multidimensional approaches and to identify effects of sample size, dimensionality and scale length on measurement precision. Results showed that the multidimensional Rasch approach yielded more precise estimates than did the unidimensional approach if the two dimensions were strongly correlated. The effect was more pronounced for long scales.
    Journal of applied measurement 01/2013; 14(1):27-43.
  • Article: Using the Rasch measurement model to design a report writing assessment instrument.
    ABSTRACT: This paper describes how the Rasch measurement model was used to develop an assessment instrument designed to measure student ability to write law enforcement incident and investigative reports. The ability to write reports is a requirement of all law enforcement recruits in the state of Michigan and is a part of the state's mandatory basic training curriculum, which is promulgated by the Michigan Commission on Law Enforcement Standards (MCOLES). Recently, MCOLES conducted research to modernize its training and testing in the area of report writing. A structured validation process was used, which included: a) an examination of the job tasks of a patrol officer, b) input from content experts, c) a review of the professional research, and d) the creation of an instrument to measure student competency. The Rasch model addressed several measurement principles that were central to construct validity, which were particularly useful for assessing student performances. Based on the results of the report writing validation project, the state established a legitimate connectivity between the report writing standard and the essential job functions of a patrol officer in Michigan. The project also produced an authentic instrument for measuring minimum levels of report writing competency, which generated results that are valid for inferences of student ability. Ultimately, the state of Michigan must ensure the safety of its citizens by licensing only those patrol officers who possess a minimum level of core competency. Maintaining the validity and reliability of both the training and testing processes can ensure that the system for producing such candidates functions as intended.
    Journal of applied measurement 01/2013; 14(1):10-26.
  • Article: Understanding Rasch measurement: Rasch models overview.
    ABSTRACT: This overview of Rasch measurement models begins with a conceptualization of our continuous experiences that are often captured as discrete observations. It goes on to discuss the properties that are required of measures if they are to transcend the occasion in which they were collected, and concludes with a discussion of the spiral of inferential development. This is followed by a discussion of the mathematical properties of the Rasch family of models that allow the transformation of discrete deterministic counts into continuous probabilistic abstractions on which science is based. The overview concludes with a discussion of six of the family of Rasch models, Binomial Trials, Poisson Counts, Rating Scale, Partial Credit, and Ranks and the types of data for which these models are appropriate.
    Journal of applied measurement 10/2012;
  • Article: Measuring Work Stress among Correctional Staff: A Rasch Measurement Approach.
    ABSTRACT: Today, the amount of stress the correctional staff endures at work is an important issue. Research has addressed this issue, but has yielded no consensus as to a properly calibrated measure of perceptions of work stress for correctional staff. Using data from a non-random sample of correctional staff (n = 228), the Rasch model was used to assess whether a specific measure of work stress would fit the model. Results show that three items rather than six items accurately represented correctional staff perceptions of work stress.
    Journal of applied measurement 01/2012; 13(4):394-402.
  • Article: Is the partial credit model a Rasch model?
    ABSTRACT: A balance scale metaphor is offered as a tool for explaining the principles of measurement and for visualizing the internal structure of dichotomous and polytomous Rasch models. The balance scale metaphor is used to guide the derivation of a general polytomous Rasch model and to illustrate the additional assumptions subsequently required to derive the Andrich (1978) rating scale model (RSM) and the Masters (1982) partial credit model (PCM). The metaphor is used to present the argument that the RSM conforms to the rules of measurement, but the PCM has interactions implicit in its structure that violate specific objectivity and sufficiency of raw scores, which challenge its status as a Rasch model. Using the metaphor and a literal interpretation of the narrative description of the PCM by Masters (1982), a new version of the PCM is derived that does conform to the rules of measurement.
    Journal of applied measurement 01/2012; 13(2):114-31.
  • Article: Item set discrimination and the unit in the Rasch model.
    ABSTRACT: The aim is to show that it is possible to parameterize discrimination for sets of items, rather than individual items, without destroying conditions for sufficiency in a form of the Rasch model. The form of the model is obtained by formalizing the relationship between discrimination and the unit of a metric. The raw score vector across item sets is the sufficient statistic for the person parameter. Simulation studies are used to show the implementation of conditional estimation solution equations based on the relevant form of the Rasch model. The model is also applied to two numeracy tests attempted by a group of common persons in a large-scale testing program. The results show improved fit compared with the Rasch model in its standard form. They also show the units of the scales were more accurately equated. The paper discusses implications for applied measurement using Rasch models and contrasts the approach with the application of the two parameter logistic (2PL) model.
    Journal of applied measurement 01/2012; 13(2):165-80.
  • Article: A Rasch measure of teachers' views of teacher-student relationships in the primary school.
    ABSTRACT: This study investigated teacher-student relationships from the teachers' point of view at Perth metropolitan schools in Western Australia. The study identified three key social and emotional aspects that affect teacher-student relationships, namely, Connectedness, Availability and Communication. Data were collected by questionnaire (N = 139) with stem-items answered in three perspectives: (1) Idealistic: this is what I would like to happen; (2) Capability: this is what I am capable of; and (3) Behaviour: this is what actually happens, using four ordered response categories: not at all (score 1), some of the time (score 2), most of the time (score 3), and almost always (score 4). Data were analysed with a Rasch measurement model and a uni-dimensional, linear scale with 24 items, ordered from easy to hard, was created. The data were shown to be highly reliable, so that valid inferences could be made from the scale. The Person Separation Index (akin to a reliability index) was 0.93; there was good global teacher and item fit to the measurement model; there was good item fit; the targeting of the item difficulties against the teacher measures was good, and the response categories were answered consistently and logically. Teachers said that the ideal items were all easier than their corresponding capability items which were in turn easier than the behaviour items (where the items fitted the model), as conceptualized. The easiest ideal items were: I like this child and This child and I get along well together. The hardest ideal item (but still easy) was: I am available for this child. The easiest behaviour item (but still hard) was: This child and I get along well together. The hardest behaviour item (and very hard) was: I am interested to learn about this child's personal thoughts, feelings and experiences. The difficulties of the items supported the conceptual structure of the variable.
    Journal of applied measurement 01/2012; 13(4):403-27.
  • Article: Beliefs about Language Development: Construct Validity Evidence.
    ABSTRACT: Understanding language development is incomplete without recognizing children's sociocultural environments, including adult beliefs about language development. Yet there is a need for data supporting valid inferences to assess these beliefs. The current study investigated the psychometric properties of data from a survey (MODeL) designed to explore beliefs in the popular culture, and their alignment with more formal theories. Support for the content, substantive, structural, generalizability, and external aspects of construct validity of the data were investigated. Subscales representing Behaviorist, Cognitive, Nativist, and Sociolinguistic models were identified as dimensions of beliefs. More than half of the items showed a high degree of consensus, suggesting culturally-transmitted beliefs. Behaviorist ideas were most popular. Bilingualism and ethnicity were related to Cognitive and Sociolinguistic beliefs. Identifying these beliefs may clarify the nature of child-directed speech, and enable the design of language intervention programs that are congruent with family and cultural expectations.
    Journal of applied measurement 01/2012; 13(4):336-59.
  • Article: Formulating latent growth using an explanatory item response model approach.
    ABSTRACT: In this paper, we present a way to extend the Hierarchical Generalized Linear Model (HGLM; Kamata (2001), Raudenbush (1995)) to include the many forms of measurement models available under the formulation known as the Multidimensional Random Coefficients Multinomial Logit (MRCML) Model (Adams, Wilson and Wang, 1997), and apply that to growth modeling. First, we review two different traditions in modeling growth studies: the first is based in the hierarchical linear modeling (HLM) tradition, and the second, which is the topic of this paper, is rooted in the Rasch measurement tradition - this is the linear Latent Growth Item Response Model (LG-IRM). Going beyond the linear case, the LG-IRM approach allows us to considerably extend the range of models available in the HLM tradition to incorporate several of the extensions of IRT models that are used in creating explanatory item response models (EIRM; De Boeck and Wilson, 2004). We next present a number of extensions - including polynomial growth modeling, differential item functioning (DIF) effects, growth functions that can be approximated by polynomial expressions, provision for polytomous responses, person and item covariates (and time varying covariates), and multiple dimensions of growth. We provide two empirical examples to illustrate several of the models, using the ConQuest software (Wu, Adams, Wilson and Haldane, 2008) to carry out the analyses. We also provide several simulations to investigate the success of the estimation procedures.
    Journal of applied measurement 01/2012; 13(1):1-22.
  • Article: Exploring the alignment of writing self-efficacy with writing achievement using Rasch measurement theory and qualitative methods.
    ABSTRACT: Alignment of writing self-efficacy and writing achievement is defined as the congruence between student confidence regarding writing skills (writing self-efficacy) and the actual performance on these writing skills as reflected in teacher grades (achievement). One purpose of this study is to examine the relationship between these two variables. A second purpose is to demonstrate a mixed-methods approach to investigating relationships between affective variables using Rasch measurement and interviews. Participants were eighth grade students (N = 94) from an ethnically and socioeconomically diverse school in the southeastern United States. Our results suggest that students who struggle with the mechanics of writing yet appreciate the expressive capacity of writing, may have higher senses of writing self-efficacy that are not predictive of performance.
    Journal of applied measurement 01/2012; 13(2):132-45.
  • Article: Concurrent Validation of CHIRP, a New Instrument for Measuring Healthcare Student Attitudes towards Interdisciplinary Teamwork.
    ABSTRACT: Positive attitudes towards teamwork among health care professionals are critical to patient safety. The purpose of this study is to describe the development and concurrent validation of a new instrument to measure attitudes towards healthcare teamwork that is generalizable across various populations of healthcare students. The Collaborative Healthcare Interdisciplinary Planning (CHIRP) scale was validated against the Readiness for Inter-Professional Learning Scale (RIPLS). Analyses included student (n = 266) demographics, ANOVA, internal consistency, factor analysis, and Rasch analysis. The two instruments correlated at r = .582. The CHIRP showed a multifactorial structure having excellent internal consistency (alpha = .850), with 25 of the 36 scale items loading onto a single Teamwork Attitudes factor. The RIPLS likewise had strong internal consistency (alpha = .796) and a three-factor structure, supporting previous studies of the instrument. However, Rasch analyses showed 14 (38.9%) of the 36 CHIRP items, but only four (21.1%) of the 19 RIPLS items remaining within the satisfactory standardized OUTFIT zone of 2.0 standard deviation units. We propose the 14 fitting items as a new, validated teamwork attitudes scale.
    Journal of applied measurement 01/2012; 13(4):360-75.

Keywords

data, equating, estimat, item, measur, measurement, model, pisa, rasch, rating, scale, statistic, student, test, using
