ABSTRACT: Linear mixed modeling is a useful approach for double mixed factorial designs with
covariates. It is explained how these designs are appropriate for the study of human behavior
as a function of characteristics of persons, of situations, and of stimuli within those situations. The
behavior of subjects nested in types of persons responding to stimuli nested in types of stimuli
defines a mixed factorial design. The inclusion of additional covariates of the observational
units can help to further explain the behavior under study. A linear mixed modeling approach
for such designs allows a combined focus on fixed effects (general effects) and individual
and stimulus differences in these effects. This combination has the potential to advance the
integration of two different sub-disciplines of psychology: general psychology and differential
psychology, so that they can borrow strength from each other. An application is presented with
semantic categorization response time data from a factorial design with age groups by word
types and with age of acquisition as an additional covariate of the words. The results throw
light on the processes underlying the effect of age of acquisition and on individual differences
and word differences.
Journal of the Royal Statistical Society Series C Applied Statistics 09/2013; To appear.
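The crossed person-by-stimulus design described in the abstract above can, under standard assumptions, be written as a linear mixed model with crossed random effects. The notation below is illustrative, not the authors' own:

```latex
% Response Y_{pi} of person p to stimulus i, with a stimulus covariate x_i
% (e.g., age of acquisition). The betas are fixed (general) effects;
% theta_p and epsilon_i are crossed random person and stimulus effects
% capturing individual and stimulus differences.
Y_{pi} = \beta_0 + \beta_1 x_i + \theta_p + \varepsilon_i + e_{pi},
\qquad
\theta_p \sim N(0,\sigma^2_\theta),\quad
\varepsilon_i \sim N(0,\sigma^2_\varepsilon),\quad
e_{pi} \sim N(0,\sigma^2_e).
```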
ABSTRACT: In social network studies, most often only a single relation (or link) between the actors is investigated. When more than one link has been recorded, the two-way sociomatrix becomes a three-way array, with the set of links being the third way. In this paper, we present a model which simultaneously accounts for the three ways in the data. Random effects are used to model the between-actor variability, on both the senders' and the receivers' sides. In addition, structural relations between the linking variables are investigated. The model is applied to a study of popularity and strength in a class of students. It is shown that popularity can be seen as a linear function of strength on the receivers' side, but not on the senders' side.
ABSTRACT: Four methods for the simulation of the Wiener process with constant drift and variance are described. These four methods are
(1) approximating the diffusion process by a random walk with very small time steps; (2) drawing directly from the joint density
of responses and reaction time by means of a (possibly) repeated application of a rejection algorithm; (3) using a discrete
approximation to the stochastic differential equation describing the diffusion process; and (4) a probability integral transform
method approximating the inverse of the cumulative distribution function of the diffusion process. The four methods for simulating
response probabilities and response times are compared on two criteria: simulation speed and accuracy of the simulation. It
is concluded that the rejection-based and probability integral transform method perform best on both criteria, and that the
stochastic differential approximation is worst. An important drawback of the rejection method is that it is applicable only
to the Wiener process, whereas the probability integral transform method is more general.
Behavior Research Methods 04/2012; 33(4):443-456. · 2.12 Impact Factor
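Method (1) above, the random-walk approximation, can be sketched in a few lines. This is a minimal illustration under assumed parameter names (drift, variance, upper boundary `a`, starting point `z`), not the authors' implementation:

```python
import math
import random

def simulate_wiener(drift, sigma, a, z, dt=1e-3, rng=random):
    """Random-walk approximation of a Wiener diffusion with constant drift:
    take steps of size sigma*sqrt(dt), biased upward with probability
    0.5*(1 + drift*sqrt(dt)/sigma), until the process hits the lower (0)
    or upper (a) boundary. Requires |drift|*sqrt(dt) < sigma.
    Returns (response, time), with response = 1 for the upper boundary."""
    x, t = z, 0.0
    step = sigma * math.sqrt(dt)                         # random-walk step size
    p_up = 0.5 * (1.0 + drift * math.sqrt(dt) / sigma)   # upward-step probability
    while 0.0 < x < a:
        x += step if rng.random() < p_up else -step
        t += dt
    return (1 if x >= a else 0), t

# With positive drift, most simulated trials should end at the upper boundary.
random.seed(1)
hits = sum(simulate_wiener(drift=1.0, sigma=1.0, a=2.0, z=1.0)[0]
           for _ in range(500))
```

Shrinking `dt` improves the approximation at the cost of simulation speed, which is exactly the trade-off on which the paper compares the four methods.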
ABSTRACT: In Experiment 1, complex propositional reasoning problems were constructed as a combination of several types of logical inferences:
modus ponens, modus tollens, disjunctive modus ponens, disjunctive syllogism, and conjunction. Rule theories of propositional
reasoning can account for how one combines these inferences, but the difficulty of the problems can be accounted for only
if a differential psychological cost is allowed for different basic rules. Experiment 2 ruled out some alternative explanations
for these differences that did not refer to the intrinsic difficulty of the basic rules. It was also found that part of the
results could be accounted for by the notion of representational cost, as it is used in the mental model theory of propositional
reasoning. However, the number of models as a measure of representational cost seems to be too coarsely defined to capture
all of the observed effects. Frank Rijmen was supported by the Fund for Scientific Research, Flanders (FWO).
ABSTRACT: This paper analyzes the sum score based (SSB) formulation of the Rasch model, where items and sum scores of persons are considered
as factors in a logit model. After reviewing the evolution leading to the equality between their maximum likelihood estimates,
the SSB model is then discussed from the point of view of pseudo-likelihood and of misspecified models. This is then employed
to provide new insights into the origin of the known inconsistency of the difficulty parameter estimates in the Rasch model.
The main results consist of exact relationships between the estimated standard errors for both models; and, for the ability
parameters, an upper bound for the estimated standard errors of the Rasch model in terms of those for the SSB model, which
are more easily available.
Keywords: Rasch model; standard error; information matrix; pseudo-likelihood
ABSTRACT: Responses to items from an intelligence test may be fast or slow. The research issue dealt with in this paper is whether the intelligence involved in fast correct responses differs in nature from the intelligence involved in slow correct responses. There are two questions related to this issue: 1. Are the processes involved different? 2. Are the abilities involved different? An answer to these questions is provided making use of data from a Raven-like matrices test and a verbal analogies test, together with a psychometric branching model. The branching model is based on three latent traits: speed, fast accuracy, and slow accuracy, and item parameters corresponding to each of these. The pattern of item difficulties is used to draw conclusions on the cognitive processes involved. The results are as follows: 1. The processes involved in fast and slow responses can be differentiated, as can be derived from qualitative differences in the patterns of item difficulty, and fast responses lead to a larger differentiation between items than slow responses do. 2. The abilities underlying fast and slow responses can also be differentiated, and fast responses allow for a better differentiation between the respondents.
ABSTRACT: The identification of differential item functioning (DIF) is often performed by means of statistical approaches that consider the raw scores as proxies for the ability trait level. One of the most popular approaches, the Mantel–Haenszel (MH) method, belongs to this category. However, replacing the ability level by the simple raw score is a source of potential Type I error inflation, not only in the presence of DIF but also when DIF is absent and in the presence of impact. The purpose of this article is to present an alternative statistical inference approach based on the same measure of DIF but such that the Type I error inflation is prevented. The key notion is that for DIF items, the measure has an outlying value that can be identified as such with inference tools from robust statistics. Although we use the MH log odds ratio as a statistic, the inference is different. A simulation study is performed to compare the robust statistical inference with the classical inference method, both based on the MH statistic. As expected, the Type I error rate inflation is avoided with the robust approach, although the power of the two methods is similar.
Educational and Psychological Measurement 01/2012; 72(2):291-311. · 1.07 Impact Factor
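The MH log odds ratio that both the classical and the robust inference share is straightforward to compute from the stratified 2x2 tables. A sketch with illustrative names (the `(A, B, C, D)` layout is an assumption, not the authors' code):

```python
import math

def mh_log_odds_ratio(strata):
    """Mantel-Haenszel common log odds ratio across score strata.
    Each stratum is a tuple (A, B, C, D): reference-group correct/incorrect
    counts, then focal-group correct/incorrect counts."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return math.log(num / den)

# A DIF-free item: within every score stratum the odds of a correct
# response are equal across groups, so the MH log odds ratio is 0.
strata = [(20, 10, 10, 5), (30, 10, 15, 5)]
lor = mh_log_odds_ratio(strata)
```

In the robust approach described above, values of this statistic across items would then be screened for outliers rather than tested one item at a time against a raw-score-matched null.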
ABSTRACT: Multiple item response profile (MIRP) models are models with crossed fixed and random effects. At least one between-person factor is crossed with at least one within-person factor, and the persons nested within the levels of the between-person factor are crossed with the items within levels of the within-person factor. Maximum likelihood estimation (MLE) of models for binary data with crossed random effects is challenging. This is because the marginal likelihood does not have a closed form, so that MLE requires numerical or Monte Carlo integration. In addition, the multidimensional structure of MIRPs makes the estimation complex. In this paper, three different estimation methods to meet these challenges are described: the Laplace approximation to the integrand; hierarchical Bayesian analysis, a simulation-based method; and an alternating imputation posterior with adaptive quadrature as the approximation to the integral. In addition, this paper discusses the advantages and disadvantages of these three estimation methods for MIRPs. The three algorithms are compared in a real data application and a simulation study was also done to compare their behaviour.
British Journal of Mathematical and Statistical Psychology 11/2011; 65(3):438-66. · 1.26 Impact Factor
ABSTRACT: An old issue in psychological assessment is to what extent power and speed each are measured by a given intelligence test. Starting from accuracy and response time data, an approach based on posterior time limits (cut-offs of recorded response time) leads to three kinds of recoded data: time data (whether or not the response precedes the cut-off), time-accuracy data (whether or not a response is correct and precedes the cut-off), and accuracy data (as time-accuracy data, but coded as missing when not preceding the time cut-off). Each type of data can be modeled as binary responses. Speed and power are investigated through the effect of posterior time limits on two main aspects: (a) the latent variable that is measured: whether it is more power-related or more speed-related; (b) how well the latent variable (of whatever kind) is measured through the item(s). As empirical data, we use responses and response times for a verbal analogies test. The main findings are that, independent of the posterior time limit, basically the same latent speed trait was measured through the time data, and basically the same latent power trait was measured through the accuracy data, while for the time-accuracy data the nature of the latent trait moved from power to speed when the posterior time limit was reduced. It was also found that a reduction of the posterior time limit had no negative effect on the reliability of the latent trait measures (of whatever kind).
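The three recodings defined above are purely mechanical and can be sketched directly; the function and argument names here are illustrative, not the authors':

```python
def recode(correct, rt, cutoff):
    """Recode one (accuracy, response time) observation under a posterior
    time limit. Returns (time_datum, time_accuracy_datum, accuracy_datum):
    - time data: 1 if the response precedes the cut-off, else 0;
    - time-accuracy data: 1 if the response is correct AND precedes it;
    - accuracy data: correctness, but None (missing) after the cut-off."""
    in_time = 1 if rt <= cutoff else 0
    time_accuracy = 1 if (correct and in_time) else 0
    accuracy = (1 if correct else 0) if in_time else None
    return in_time, time_accuracy, accuracy
```

Applying this with a sequence of shrinking cut-offs yields the families of binary data sets whose latent traits the paper compares.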
ABSTRACT: The models used in this article are secondary dimension mixture models with the potential to explain differential item functioning (DIF) between latent classes, called latent DIF. The focus is on models with a secondary dimension that is at the same time specific to the DIF latent class and linked to an item property. A description of the models is provided along with a means of estimating model parameters using easily available software and a description of how the models behave in two applications. One application concerns a test that is sensitive to speededness and the other is based on an arithmetic operations test where the division items show latent DIF.
ABSTRACT: Standardized tests are used widely in comparative studies of clinical populations, either as dependent or control variables. Yet, one cannot always be sure that the test items measure the same constructs in the groups under study. In the present work, 460 participants with intellectual disability of undifferentiated etiology and 488 typical children were tested using Raven’s Colored Progressive Matrices (RCPM). Data were analyzed using binomial logistic regression modeling designed to detect differential item functioning (DIF). Results showed that 12 items out of 36 function differentially between the two groups, but only 2 items exhibit at least moderate DIF. Thus, a very large majority of the items have identical discriminative power and difficulty levels across the two groups. It is concluded that RCPM can be used with confidence in studies comparing participants with and without intellectual disability. In addition, it is suggested that methods for investigating internal bias of tests used in cross-cultural, cross-linguistic, or cross-gender comparisons should also be regularly employed in studies of clinical populations, particularly in the field of developmental disability, to show the absence of systematic measurement error (i.e., DIF) affecting item responses.
ABSTRACT: This paper focuses on the identification of differential item functioning (DIF) when more than two groups of examinees are considered. We propose to consider items as elements of a multivariate space, where DIF items are outlying elements. Following this approach, the situation of multiple groups is quite a natural case. A robust statistics technique is proposed to identify DIF items as outliers in the multivariate space. For low dimensionalities, up to two or three groups, a simple graphical tool is also derived. We illustrate our approach with a re-analysis of data from Kim, Cohen, and Park (1995) on using calculators for a mathematics test.
Multivariate Behavioral Research 01/2011; 46:733-755. · 1.66 Impact Factor
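The core idea of screening items as outliers can be illustrated in one dimension (i.e., the two-group case) with a median/MAD robust z-score. This is a deliberate simplification of the multivariate approach described above, with illustrative names:

```python
import statistics

def flag_outlying_items(dif_values, threshold=3.0):
    """Flag items whose DIF measure is an outlier relative to the bulk of
    the items, using a robust z-score based on the median and the median
    absolute deviation (MAD). The factor 0.6745 makes the MAD consistent
    with the standard deviation under normality. Assumes MAD > 0."""
    med = statistics.median(dif_values)
    mad = statistics.median(abs(v - med) for v in dif_values)
    return [i for i, v in enumerate(dif_values)
            if 0.6745 * abs(v - med) / mad > threshold]
```

Because the cut-off is computed from robust location and scale estimates, a handful of genuinely DIF items cannot mask themselves by inflating the scale, which is the point of the robust approach.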
ABSTRACT: Differential item functioning (DIF) is an important issue of interest in psychometrics and educational measurement. Several methods have been proposed in recent decades for identifying items that function differently between two or more groups of examinees. Starting from a framework for classifying DIF detection methods and from a comparative overview of the most traditional methods, an R package for nine methods, called difR, is presented. The commands and options are briefly described, and the package is illustrated through the analysis of a data set on verbal aggression.
Behavior Research Methods 08/2010; 42(3):847-62. · 2.12 Impact Factor
ABSTRACT: For the analysis of binary data, various deterministic models have been proposed, which are generally simpler to fit and easier to understand than probabilistic models. We claim that corresponding to any deterministic model is an implicit stochastic model in which the deterministic model fits imperfectly, with errors occurring at random. In the context of binary data, we consider two error models: in the first model, all predictions are equally likely to be in error; in the second model, the probability of error depends on the model prediction. We show how to fit these models using a stochastic modification of deterministic optimization schemes. The advantages of fitting the stochastic models explicitly (rather than implicitly, by simply fitting a deterministic model and accepting the occurrence of errors) include quantification of uncertainty in the deterministic model's parameter estimates, better estimation of the true model error rate, and the ability to check the fit of ...
ABSTRACT: In this article we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the Random Item Mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is assumed for the item difficulties, such that the items may belong to one of two classes: a DIF or a non-DIF class. The crucial difference between the DIF class and the non-DIF class is that the item difficulties in the DIF class may differ according to the observed person groups, while they are equal across the person groups for the items from the non-DIF class. Statistical inference for the RIM is carried out in a Bayesian framework. The performance of the RIM is evaluated using a simulation study in which it is compared with traditional procedures, like the likelihood ratio test, the Mantel-Haenszel procedure, and the standardized p-DIF procedure. In this comparison, the RIM performs better than the other methods. Finally, the usefulness of the model is also demonstrated on a real-life data set.
Journal of Educational Measurement 01/2010; 47:432-457. · 1.00 Impact Factor
ABSTRACT: This study introduces an approach for modeling multidimensional response data with construct-relevant group and domain factors. The item level parameter estimation process is extended to incorporate the refined effects of test dimension and group factors. Differences in item performances over groups are evaluated, distinguishing two levels of differential item functioning (DIF): a domain level and an item level. An illustration is presented using a Dutch spelling proficiency scale administered to two subgroups. DIF is modeled by the interaction between group and item domain (domain level DIF) and by the interaction between groups and items within each domain (item level DIF). A set of item response theory models was estimated using an adaptation of the logistic regression approach. The model with domain-specific item-by-group interactions or DIF performed better than the other models neglecting domain or group differences. The method appears to be promising in that explicit domain factors can be implemented into the model estimation procedure to better understand why items favor a specific language group over another.
International Journal of Testing 04/2009; 9(2):151-166.
ABSTRACT: The article proposes a family of item-response models that allow the separate and independent specification of three orthogonal components: item attribute, person covariate, and local item dependence. Special interest lies in extending the linear logistic test model, which is commonly used to measure item attributes, to tests with embedded item clusters. The problem of local item dependence arises in item clusters. Existing methods for handling such dependence, however, often fail to satisfy the property of invariant marginal interpretation of the item attribute parameters. Although such a property may not be necessary for applications that focus on predictive analysis, it is critical for linear logistic test models. To achieve the marginal property, we implement an iterative estimation method, which is illustrated using data collected from an inventory on verbal aggressiveness.
ABSTRACT: Structural equation models are commonly used to analyze 2-mode data sets, in which a set of objects is measured on a set of variables. The underlying structure within the object mode is evaluated using latent variables, which are measured by indicators coming from the variable mode. Additionally, when the objects are measured under different conditions, 3-mode data arise, and with this, the simultaneous study of the correlational structure of 2 modes may be of interest. In this article the authors present a model with a simultaneous latent structure for 2 of the 3 modes of such a data set. They present an empirical illustration of the method using a 3-mode data set (person by situation by response) exploring the structure of anger and irritation across different interpersonal situations as well as across persons.
ABSTRACT: The increasing use of diary methods calls for the development of appropriate statistical methods. For the resulting panel data, latent Markov models can be used to model both individual differences and temporal dynamics. The computational burden associated with these models can be overcome by exploiting the conditional independence relations implied by the model. This is done by associating a probabilistic model with a directed acyclic graph, and applying transformations to the graph. The structure of the transformed graph provides a factorization of the joint probability function of the manifest and latent variables, which is the basis of a modified and more efficient E-step of the EM algorithm. The usefulness of the approach is illustrated by estimating a latent Markov model involving a large number of measurement occasions and, subsequently, a hierarchical extension of the latent Markov model that allows for transitions at different levels. Furthermore, logistic regression techniques are used to incorporate restrictions on the conditional probabilities and to account for the effect of covariates. Throughout, models are illustrated with an experience sampling methodology study on the course of emotions among anorectic patients.
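The kind of conditional-independence factorization exploited above is, for a simple latent Markov (hidden Markov) chain, the classical forward recursion: the joint probability of the observations is accumulated one occasion at a time instead of summing over all latent-state sequences. A minimal sketch with illustrative names, not the authors' graph-based algorithm:

```python
def forward_likelihood(pi, A, B, obs):
    """Likelihood of an observed response sequence under a latent Markov
    model via the forward recursion. pi[i]: initial probability of latent
    state i; A[i][j]: transition probability from state i to j; B[i][k]:
    probability of response category k in state i; obs: category indices."""
    n = len(pi)
    # alpha[i] = P(observations so far, current latent state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

# Two latent states, two response categories, two measurement occasions.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.8, 0.2], [0.3, 0.7]]
lik = forward_likelihood(pi, A, B, [0, 1])
```

The recursion costs O(T n^2) for T occasions and n latent states, versus O(n^T) for brute-force summation, which is why such factorizations make many-occasion diary data tractable.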