
Björn Andersson
University of Oslo · Centre for Educational Measurement
PhD
Publications: 30
Reads: 3,028
Citations: 236
Additional affiliations
January 2018 - present
September 2015 - October 2017
September 2011 - August 2015
Publications (30)
In diagnostic classification models (DCMs), the Q matrix encodes which attributes are required for each item. The Q matrix is usually predetermined by the researcher but may in practice be misspecified, which yields incorrect statistical inference. Instead of using a predetermined Q matrix, it is possible to estimate it simultaneously with the it...
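To illustrate how a Q matrix maps attributes to items, here is a minimal sketch that assumes the DINA model as one concrete DCM. The abstract does not name a specific model, and the function names, the example Q matrix, and the guess and slip values below are hypothetical.

```python
import numpy as np

def dina_ideal_response(alpha, Q):
    """Ideal response per item: 1 if the profile holds every required attribute."""
    alpha = np.asarray(alpha)
    Q = np.asarray(Q)
    # Row j of Q flags the attributes item j requires; an item is mastered
    # only when no required attribute (q_jk = 1) is missing (alpha_k = 0).
    return np.all(alpha >= Q, axis=1).astype(int)

def dina_correct_prob(alpha, Q, guess, slip):
    """P(correct) under DINA: 1 - slip if the item is mastered, else guess."""
    eta = dina_ideal_response(alpha, Q)
    return np.where(eta == 1, 1.0 - np.asarray(slip), np.asarray(guess))

# Hypothetical example: 3 items, 2 attributes.
Q = np.array([[1, 0],   # item 1 requires attribute 1
              [0, 1],   # item 2 requires attribute 2
              [1, 1]])  # item 3 requires both
alpha = np.array([1, 0])  # respondent has attribute 1 only
eta = dina_ideal_response(alpha, Q)
probs = dina_correct_prob(alpha, Q, guess=[0.2, 0.2, 0.1], slip=[0.1, 0.1, 0.2])
```

If a row of Q is misspecified, the ideal response for that item changes for some attribute profiles, which is how a wrong Q matrix propagates into the estimated response probabilities.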
Numerical quadrature methods are needed for many models in order to approximate integrals in the likelihood function. In this note, we correct the error rate given by Liu & Pierce (1994) for integrals approximated with adaptive Gauss–Hermite quadrature and show that the approximation is less accurate than previously thought. We discuss the relation...
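As a rough illustration of adaptive Gauss–Hermite quadrature, the sketch below recentres and rescales the standard nodes around a supplied location and scale. It assumes `mu_hat` and `sigma_hat` are already available (in practice they typically come from the mode and curvature of the integrand); the function name and the test integrand are hypothetical, and the sketch says nothing about the corrected error rate discussed in the paper.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss

def adaptive_gauss_hermite(f, mu_hat, sigma_hat, n_points=5):
    """Approximate the integral of f over the real line.

    The Gauss-Hermite nodes (weight function exp(-t^2)) are shifted to
    mu_hat and scaled by sqrt(2) * sigma_hat so that they cover the region
    where f has most of its mass.
    """
    nodes, weights = hermgauss(n_points)
    x = mu_hat + math.sqrt(2.0) * sigma_hat * nodes
    values = np.array([f(xi) for xi in x])
    # exp(nodes**2) undoes the weight function built into the rule.
    return math.sqrt(2.0) * sigma_hat * np.sum(weights * np.exp(nodes ** 2) * values)

# Hypothetical integrand: x^2 times an unnormalised N(1, 0.5^2) kernel.
m, s = 1.0, 0.5
f = lambda x: x ** 2 * math.exp(-((x - m) ** 2) / (2.0 * s ** 2))
approx = adaptive_gauss_hermite(f, mu_hat=m, sigma_hat=s, n_points=5)
exact = s * math.sqrt(2.0 * math.pi) * (m ** 2 + s ** 2)
```

In this example the integrand is a polynomial times a Gaussian kernel centred exactly at `mu_hat`, so the rule is exact up to rounding. For the likelihood integrands that arise in latent variable models the result is only an approximation.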
The estimation of high-dimensional latent regression item response theory (IRT) models is difficult because of the need to approximate integrals in the likelihood function. Proposed solutions in the literature include using stochastic approximations, adaptive quadrature, and Laplace approximations. We propose using a second-order Laplace approximat...
Reliability of scores from psychological or educational assessments provides important information regarding the precision of measurement. The reliability of scores is, however, population dependent and may vary across groups. In item response theory, this population dependence can be attributed to differential item functioning or to differences in t...
This study aimed to determine the dimensionality of morphological knowledge by examining different sources of variance. According to the Morphological Pathways Framework (Levesque et al., Journal of Research in Reading, 44, 10–26, 2021), morphological awareness, morphological analysis and morphological decoding are related, but distinct dimensions...
Background
The Home and Community Social Behavior Scales (HCSBS) is a rating scale that assesses social competence and antisocial behavior among children and youths between ages 5–18. The present study aimed to investigate the psychometric properties of the HCSBS by applying item response theory (IRT).
Methods
The HCSBS was completed by parents of...
Introduction:
The Montreal Cognitive Assessment (MoCA) has started to be used in longitudinal investigations to measure cognition trends but its measurement properties over time are largely unknown. This study aimed to examine the longitudinal measurement invariance of individual MoCA items.
Method:
We used four waves of data collected between 2...
Enabling comparable scores across grades is of interest for policymakers to evaluate educational systems, for researchers to investigate substantive questions, and for teachers to infer student growth. This study applied a vertical scaling design to numeracy tests given in grades 5 and 8 as part of the Norwegian national testing system. Our des...
The certainty of response index (CRI) measures respondents' confidence level when answering an item. In conjunction with the answers to the items, previous studies have used descriptive statistics and arbitrary thresholds to identify student knowledge profiles with the CRIs. Whereas this approach overlooked the measurement error of the observed ite...
In standardized testing, equating is used to ensure comparability of test scores across multiple test administrations. One equipercentile observed-score equating method is kernel equating, where an essential step is to obtain continuous approximations to the discrete score distributions by applying a kernel with a smoothing bandwidth parameter. Whe...
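The continuization step can be sketched as follows, assuming a Gaussian kernel and the standard mean- and variance-preserving form used in the kernel equating literature. The function names and the example score distribution are hypothetical, and the bandwidth `h` is simply supplied rather than selected.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution.

    F_h(x) = sum_j r_j * Phi((x - a*x_j - (1 - a)*mu) / (a*h)),
    with a = sqrt(var / (var + h^2)), which keeps the mean and variance
    of the continuized distribution equal to those of the discrete one.
    """
    scores = np.asarray(scores, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mu = np.sum(probs * scores)
    var = np.sum(probs * (scores - mu) ** 2)
    a = math.sqrt(var / (var + h ** 2))
    z = (x - a * scores - (1.0 - a) * mu) / (a * h)
    return float(sum(p * normal_cdf(zj) for p, zj in zip(probs, z)))

# Hypothetical symmetric score distribution on 0..4.
scores = [0, 1, 2, 3, 4]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
F_mid = kernel_cdf(2.0, scores, probs, h=0.6)
```

A small `h` keeps the continuous CDF close to the discrete steps, while a large `h` pulls it toward a normal distribution. This trade-off is why the choice of bandwidth matters.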
Competitiveness as a personality trait is commonly viewed as having three dimensions – competing to win (CW; to dominate and suppress others unscrupulously), competing to surpass (CS; to surpass or excel above others), and competing to develop (CD; to focus on personal development). Using a sample of 926 participants in mainland China, this study e...
As one of the important research areas of cognitive diagnosis assessment, cognitive diagnostic computerized adaptive testing (CD-CAT) has received much attention in recent years. Measurement accuracy is the major theme in CD-CAT, and both the item selection method and the attribute coverage have a crucial effect on measurement accuracy. A new attri...
Background: Bridging scores generated from different cognitive assessment tools is necessary to efficiently track changes in cognition across the continuum of care. This study linked scores from the Montreal Cognitive Assessment-5 min (MoCA 5-min) to the interRAI Cognitive Performance Scale (CPS), commonly adopted tools in clinical and long-term ca...
Log-file data from computer-based assessments can provide useful collateral information for estimating student abilities. In turn, this can improve traditional approaches that only consider response accuracy. Based on the amounts of time students spent on 10 mathematics items from PISA 2012, this study evaluated the overall changes in and measu...
We applied latent class analysis and the rule space model to verify the cumulative characteristic of conceptual change by developing a learning progression for buoyancy. For this study, we first abstracted seven attributes of buoyancy and then developed a hypothesized learning progression for buoyancy. A 14-item buoyancy instrument was administered...
The traditional application of the Montreal Cognitive Assessment uses total scores in defining cognitive impairment levels, without considering variations in item properties across populations. Item response theory (IRT) analysis provides a potential solution to minimize the effect of important confounding factors such as education. This research a...
Background: The expected growth in dementia prevalence mandates an efficient and cost-effective triage system for timely tertiary prevention. Fast and accurate cognitive screening assessments such as the Montreal Cognitive Assessment (MoCA) are essential tools in the system, although the effects of education posed some challenges. We investigated...
Diagnostic classification models (DCMs) have been widely used in education, psychology, and many other disciplines. To select the most appropriate DCM for each item, the Wald test has been recommended. However, prior research has revealed that this test provides inflated Type I error rates. To address this problem, the authors propose to replace th...
Two new methods to estimate the asymptotic covariance matrix for marginal maximum likelihood estimation of cognitive diagnosis models (CDMs), the inverse of the observed information matrix and the sandwich-type estimator, are introduced. Unlike several previous covariance matrix estimators, the new methods take into account both the item and struct...
In item response theory (IRT), when two groups from different populations take two separate tests, there is a need to link the two ability scales so that the item parameters of the tests are comparable across the groups. To link the two scales, information from common items is utilized to estimate linking coefficients which place the item paramete...
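As one classical way to obtain linking coefficients from common items, the sketch below uses the mean-sigma method on difficulty estimates. The abstract is truncated and does not say which linking method the paper studies, so this choice, the function name, and the example values are assumptions for illustration only.

```python
import numpy as np

def mean_sigma_linking(b_from, b_to):
    """Mean-sigma linking coefficients (A, B) from common-item difficulties.

    The coefficients place the 'from' scale on the 'to' scale:
    b_to ~ A * b_from + B (and discriminations transform as a / A).
    """
    b_from = np.asarray(b_from, dtype=float)
    b_to = np.asarray(b_to, dtype=float)
    A = np.std(b_to) / np.std(b_from)
    B = np.mean(b_to) - A * np.mean(b_from)
    return A, B

# Hypothetical common-item difficulties whose two scales differ by an
# exact linear transformation, so the method recovers it exactly.
b_x = np.array([-1.0, 0.0, 0.5, 1.5])
b_y = 1.2 * b_x + 0.3
A, B = mean_sigma_linking(b_x, b_y)
```

With estimated rather than exact parameters, `A` and `B` carry sampling error from both calibrations.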
In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability...
Writing assessments are an indispensable part of most language competency tests. In our research, we used cross-classified models to study rater effects in the real essay rating process of a large-scale, high-stakes educational examination administered in China in 2011. Generally, four cross-classified models are suggested for investigation of rate...
In observed-score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response theor...
Item response theory (IRT) observed-score kernel equating is introduced for the non-equivalent groups with anchor test equating design using either chain equating or post-stratification equating. The equating function is treated in a multivariate setting and the asymptotic covariance matrices of IRT observed-score kernel equating functions are deri...
We investigate the current bandwidth selection methods in kernel equating and propose a method based on Silverman's rule of thumb for selecting the bandwidth parameters. In kernel equating, the bandwidth parameters have previously been obtained by minimizing a penalty function. This minimization process has been criticized by practitioners for bein...
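For orientation, the generic form of Silverman's rule of thumb from kernel density estimation can be sketched as below. This is the textbook rule, not necessarily the adapted version proposed in the paper for kernel equating; the function name and example data are hypothetical.

```python
import numpy as np

def silverman_bandwidth(x):
    """Generic Silverman rule of thumb for a Gaussian kernel.

    h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5)
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    sd = np.std(x, ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    # Use the smaller of the two spread measures to guard against outliers.
    spread = min(sd, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-0.2)

# Hypothetical data: the integers 1..100 treated as observed scores.
scores = np.arange(1, 101)
h = silverman_bandwidth(scores)
```

Unlike minimizing a penalty function, the rule is a closed-form expression in the sample size and spread, so it needs no iterative optimization.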
We study the implication of violations of the faithfulness condition due to parameter cancellations on estimation of the directed acyclic graph (DAG) skeleton. Three settings are investigated: when (i) faithfulness is guaranteed, (ii) faithfulness is not guaranteed, and (iii) the parameter distributions are concentrated around unfaithfulness (near-un...
In standardized testing it is important to equate tests in order to ensure that the test takers, regardless of the test version given, obtain a fair test. Recently, the kernel method of test equating, which is a conjoint framework of test equating, has gained popularity. The kernel method of test equating includes five steps: (1) pre-smoothing, (2)...