Article

JOINT AND CONDITIONAL MAXIMUM LIKELIHOOD ESTIMATION FOR THE RASCH MODEL FOR BINARY RESPONSES


Abstract

The usefulness of joint and conditional maximum-likelihood estimation is considered for the Rasch model under realistic testing conditions in which the number of examinees is very large and the number of items is relatively large. Conditions for consistency and asymptotic normality are explored, effects of model error are investigated, measures of prediction are estimated, and generalized residuals are developed.
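As a point of reference, the Rasch model for binary responses posits logit P(X_ij = 1) = θ_i − β_j for examinee ability θ_i and item difficulty β_j. A minimal simulation sketch of this setup (all sizes and parameter values here are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def rasch_prob(theta, beta):
    """P(X_ij = 1) under the Rasch model: logit P = theta_i - beta_j."""
    theta = np.asarray(theta)[:, None]
    beta = np.asarray(beta)[None, :]
    return 1.0 / (1.0 + np.exp(-(theta - beta)))

rng = np.random.default_rng(0)
n, q = 1000, 40                    # many examinees, moderately many items
theta = rng.normal(0.0, 1.0, n)    # abilities
beta = np.linspace(-2.0, 2.0, q)   # difficulties
p = rasch_prob(theta, beta)
x = (rng.random((n, q)) < p).astype(int)   # binary response matrix
```

The asymptotic setting studied in the paper corresponds to letting both n and q above grow.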


... Within item-response analysis, applications of generalized residuals have been quite rare. Exceptions are found in the case of the Rasch model (Glas & Verhelst, 1989, 1995; Haberman, 2004). Most residuals that have been employed in item-response analysis are standardized residuals of the form t_0 = (O − Ê)/s_0, where s_0 is an estimate of the standard deviation of the observed linear combination O (Hambleton, Swaminathan, & Rogers, 1991). ...
... In the conditional estimation case of section 2, the Rasch version of the 1PL model is assumed, so that no restrictions are imposed on the distribution F. In this case, generalized residuals derived from log-linear models may be applied (Haberman, 1976, 1978, 1979, 2004), for the X_i satisfy the log-linear model ...
... where S(x) = Σ_{j=1}^q x_j is the sum of the coordinates of x, and β is the q-dimensional vector of the β_j, 1 ≤ j ≤ q (Tjur, 1982). In this case, and in generalizations to the partial credit model, some special instances of generalized residuals have appeared in the literature (Glas & Verhelst, 1989; Haberman, 2004). In addition, the square of a generalized residual for the Rasch model yields a generalized Pearson test (Glas & Verhelst, 1995). ...
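The standardized residual t_0 = (O − Ê)/s_0 described in the excerpts above can be sketched for the number-correct on a single item. This is an illustration with simulated values, and it treats the fitted probabilities as known when computing s_0, which is exactly the simplification that generalized residuals are designed to correct:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical fitted probabilities p_hat[i, j] for person i on item j,
# and observed responses x[i, j]; both simulated here for illustration
# (in practice p_hat would come from a fitted item-response model).
n, q = 500, 20
p_hat = rng.uniform(0.2, 0.9, (n, q))
x = (rng.random((n, q)) < p_hat).astype(int)

# Linear combination O = sum_i w_i x_ij for one item j, with w_i = 1:
# the observed number correct on item j.
j = 0
O = x[:, j].sum()
E_hat = p_hat[:, j].sum()                              # estimated expectation
s0 = np.sqrt((p_hat[:, j] * (1 - p_hat[:, j])).sum())  # estimated sd of O
t0 = (O - E_hat) / s0                                  # standardized residual
```

If the model fits, t_0 is approximately standard normal; ignoring the sampling variability of the parameter estimates inside p_hat is what makes this residual only approximate.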
Article
Generalized residuals are a tool employed in the analysis of contingency tables to examine goodness of fit. They may be applied to item response models with little complication. Their use is illustrated with testing data from operational programs. Models considered include the Rasch model and the two-parameter logistic model.
... For q = 2, the result holds with δ = 0 and , an arbitrary real number. The result is consistent with the general observation that, for any integer q ≥ 2, if θ i is a bounded random variable, then, for any real constant c > 0, a real d > 0 exists such that |γ − β| < d/q whenever |β| < c (Haberman, 2004). ...
... For a given distribution of θ i and for a given β, the asymptotic limit γ can be computed with little difficulty by exploiting some techniques for efficient computation of probabilities of sums of independent Bernoulli random variables (Haberman, 2004). Computations also exploit standard methods to calculate maximum-likelihood estimates for logit models. ...
... . Computation of g kj (a) and h k (a) can be achieved by a recursive algorithm (Haberman, 2004). ...
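The recursive algorithm referred to above for probabilities of sums of independent Bernoulli random variables can be sketched as a one-item-at-a-time convolution (a standard recursion; the function name is ours):

```python
import numpy as np

def sum_bernoulli_pmf(p):
    """PMF of S = X_1 + ... + X_q for independent Bernoulli(p_j) variables.

    The distribution is built up one variable at a time:
    f_k(s) = f_{k-1}(s) * (1 - p_k) + f_{k-1}(s - 1) * p_k.
    """
    f = np.array([1.0])           # distribution of the empty sum
    for pj in p:
        g = np.zeros(len(f) + 1)
        g[:-1] += f * (1.0 - pj)  # X_k = 0: sum unchanged
        g[1:] += f * pj           # X_k = 1: sum shifts up by one
        f = g
    return f

pmf = sum_bernoulli_pmf([0.2, 0.5, 0.8])
```

For the Rasch model, p_j = P(X_j = 1 | θ), and this recursion gives the distribution of the sum score at each ability value, which is the computational core of both the asymptotic-limit calculations and conditional estimation.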
Article
Techniques are developed for approximation and exact computation of the asymptotic limit of the item parameter estimates obtained by application of joint maximum-likelihood estimation to the Rasch model.
... Our theoretical framework assumes a diverging number of items, which is suitable when analyzing large-scale data. To the best of our knowledge, such an asymptotic setting has not received enough attention, except in Haberman (1977, 2004) and Chiu et al. (2016). Our theoretical analysis applies to a general MIRT model that includes the multidimensional two-parameter logistic model (Reckase and McKinley, 1983; Reckase, 2009) as a special case, while the analyses in Haberman (1977, 2004) and Chiu et al. (2016) are limited to the unidimensional Rasch model (Rasch, 1960; Lord et al., 1968) and cognitive diagnostic models (Rupp et al., 2010), respectively. ...
... Our technical tools for studying the properties of the CJMLE include theoretical developments in matrix completion theory (e.g. ...
Article
Full-text available
Multidimensional item response theory is widely used in education and psychology for measuring multiple latent traits. However, exploratory analysis of large-scale item response data with many items, respondents, and latent traits is still a challenge. In this paper, we consider a high-dimensional setting in which both the number of items and the number of respondents grow to infinity. A constrained joint maximum likelihood estimator is proposed for estimating both item and person parameters, which yields good theoretical properties and computational advantages. Specifically, we derive error bounds for parameter estimation and develop an efficient algorithm that can scale to very large datasets. The proposed method is applied to a large-scale personality assessment data set from the Synthetic Aperture Personality Assessment (SAPA) project. Simulation studies are conducted to evaluate the proposed method.
... Blackwood and Bradley (1989) provided a formal proof that the likelihood equations for models (1) and (2) have the same solution, which we write as JMLE = SSBE. More recently, Haberman (2004) reported the same result in terms of (3) and provided compelling evidence of the computational savings achieved by fitting this model using standard software, particularly for large data sets. He also suggested that the JMLE can be used as a starting point for a Newton-Raphson algorithm that computes the CMLE. ...
... The importance of these new relationships is primarily theoretical, but they may have some practical value as well. As suggested by Haberman (2004), the SSBE could at least be used as initial points for other estimation procedures, and it is useful to have an idea of precision to compare the new estimates with the SSBE. ...
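The equivalence noted above, between JML estimation of the Rasch model and a standard logit fit with sum-score and item factors, can be sketched with simulated data. This is a minimal illustration only: the sizes, the β_1 = 0 identification, and the plain Newton-Raphson loop are our assumptions, not any cited implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 300, 10
theta = rng.normal(0, 1, n)
beta_true = np.linspace(-1.5, 1.5, q)
x = (rng.random((n, q)) < 1/(1 + np.exp(-(theta[:, None] - beta_true[None, :])))).astype(int)

s = x.sum(axis=1)
keep = (s > 0) & (s < q)          # persons with extreme scores drop out of JML
x, s = x[keep], s[keep]
scores = np.unique(s)

# Long-format design: one row per (person, item) response, with a dummy
# for the person's sum score and a dummy for the item, so that
# logit P(x_ij = 1) = theta_{s_i} - beta_j.  Identification: beta_1 = 0.
rows, y = [], []
for i in range(x.shape[0]):
    for j in range(q):
        z_score = (scores == s[i]).astype(float)
        z_item = np.zeros(q - 1)
        if j > 0:
            z_item[j - 1] = -1.0
        rows.append(np.concatenate([z_score, z_item]))
        y.append(x[i, j])
Z, y = np.array(rows), np.array(y, dtype=float)

# Newton-Raphson for the logit model
w = np.zeros(Z.shape[1])
for _ in range(25):
    mu = 1/(1 + np.exp(-Z @ w))
    grad = Z.T @ (y - mu)
    H = (Z * (mu * (1 - mu))[:, None]).T @ Z
    w = w + np.linalg.solve(H + 1e-8 * np.eye(len(w)), grad)

beta_jml = np.concatenate([[0.0], w[len(scores):]])   # item difficulties, beta_1 = 0
```

Because persons with the same sum score share one ability parameter, the fitted item difficulties coincide with the JML estimates, and any off-the-shelf logit routine applied to this design reproduces them.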
Article
Full-text available
This paper analyzes the sum score based (SSB) formulation of the Rasch model, where items and sum scores of persons are considered as factors in a logit model. After reviewing the evolution leading to the equality between their maximum likelihood estimates, the SSB model is then discussed from the point of view of pseudo-likelihood and of misspecified models. This is then employed to provide new insights into the origin of the known inconsistency of the difficulty parameter estimates in the Rasch model. The main results consist of exact relationships between the estimated standard errors for both models; and, for the ability parameters, an upper bound for the estimated standard errors of the Rasch model in terms of those for the SSB model, which are more easily available. Keywords: Rasch model, standard error, information matrix, pseudo-likelihood
... This setting seems reasonable for analyzing large-scale item response data. Similar asymptotic settings have been considered in psychometric research, including the analysis of unidimensional IRT models (Haberman, 1977, 2004) and diagnostic classification models (Chiu et al., 2016). Under this asymptotic setting, we propose a constrained joint maximum likelihood estimator (CJMLE) that has a certain notion of statistical consistency in recovering factor loadings. ...
Preprint
Joint maximum likelihood (JML) estimation is one of the earliest approaches to fitting item response theory (IRT) models. This procedure treats both the item and person parameters as unknown but fixed model parameters and estimates them simultaneously by solving an optimization problem. However, the JML estimator is known to be asymptotically inconsistent for many IRT models when the sample size goes to infinity and the number of items stays fixed. Consequently, in the psychometrics literature, this estimator is less preferred to the marginal maximum likelihood (MML) estimator. In this paper, we re-investigate the JML estimator for high-dimensional exploratory item factor analysis, from both statistical and computational perspectives. In particular, we establish a notion of statistical consistency for a constrained JML estimator, under an asymptotic setting in which both the numbers of items and people grow to infinity and many responses may be missing. A parallel computing algorithm is proposed for this estimator that can scale to very large datasets. Via simulation studies, we show that when the dimensionality is high, the proposed estimator yields similar or even better results than those from the MML estimator, but can be obtained computationally much more efficiently. An illustrative real data example is provided based on the revised version of Eysenck's Personality Questionnaire (EPQ-R).
... For example, one stream of literature considers bias reduction in relevant model families (Firth, 1993; Firth, 2009, 2011; Kosmidis, 2014; Kenne Pagui et al., 2017; Kosmidis et al., 2020; Kosmidis and Firth, 2021). Another, independent, stream of investigation considers estimation in the context of Rasch models, typically considering both bias and predictive accuracy (Molenaar, 1995; Linacre, 2004; Haberman, 2004; Robitzsch, 2021). A third stream considers Bayesian estimation and the selection of an appropriate prior within a Bradley-Terry model (Davidson and Solomon, 1973; Leonard, 1977; Chen and Smith, 1984; Whelan, 2017; Phelan and Whelan, 2017). ...
Preprint
Full-text available
Comparative Judgement is an assessment method where item ratings are estimated based on rankings of subsets of the items. These rankings are typically pairwise, with ratings taken to be the estimated parameters from fitting a Bradley-Terry model. Likelihood penalization is often employed. Adaptive scheduling of the comparisons can increase the efficiency of the assessment. We show that the most commonly used penalty is not the best-performing penalty under adaptive scheduling and can lead to substantial bias in parameter estimates. We demonstrate this using simulated and real data and provide a theoretical explanation for the relative performance of the penalties considered. Further, we propose a superior approach based on bootstrapping. It is shown to produce better parameter estimates for adaptive schedules and to be robust to variations in underlying strength distributions and initial penalization method.
... Unlike joint maximum likelihood estimation (JMLE), which estimates item and person parameters simultaneously, CMLE considers individual sum scores to be sufficient statistics regarding the person parameter θ_i, allowing the item parameters to be estimated first and then used to estimate the person parameters in a second step (Haberman, 2004). ...
Article
Full-text available
Besides teachers' professional knowledge, their self-efficacy is a crucial aspect in promoting students' scientific reasoning (SR). However, because no measurement instrument has yet been published that specifically refers to self-efficacy beliefs regarding the task of teaching SR, we adapted the Science Teaching Efficacy Belief Instrument (STEBI) accordingly, resulting in the Teaching Scientific Reasoning Efficacy Beliefs Instrument (TSR-EBI). While the conceptual framework of the TSR-EBI is comparable to that of the STEBI in general terms, it goes beyond it in terms of specificity, acknowledging the fact that teaching SR requires very specific knowledge and skills that are not necessarily needed to the same extent for promoting other competencies in science education. To evaluate the TSR-EBI's psychometric quality, we conducted two rounds of validation. Both samples (N1 = 114; N2 = 74) consisted of pre-service teachers enrolled in university master's programs in Germany. The collected data were analyzed by applying Rasch analysis and known-group comparisons. In the course of an analysis of the TSR-EBI's internal structure, we found a 3-category scale to be superior to a 5-category structure. The person and item reliability of the scale proved to be satisfactory. Furthermore, during the second round of validation, it became clear that the results previously found for the 3-category scale were generally replicable across a new (but comparable) sample, which clearly supports the TSR-EBI's psychometric quality. Moreover, in terms of test-criterion relationships, the scale was also able to discriminate between groups that are assumed to have different levels of self-efficacy regarding teaching SR. Nonetheless, some findings also suggest that the scale might benefit from having the selection of individual items reconsidered (despite acceptable item fit statistics). 
On balance, however, we believe that the TSR-EBI has the potential to provide valuable insights in future studies regarding factors that influence teachers' self-efficacy, such as their professional experiences, prior training, or perceived barriers to effective teaching.
... In our setting, random assignment of questions to students can be seen as a special case of missing data. With complete data, the conditions for the consistency of the maximum likelihood estimators have been analyzed (Haberman, 1977, 2004). With missing data, although plenty of simulation work exists, there is a lack of theoretical work mathematically proving the consistency of the maximum likelihood estimators. ...
Preprint
This paper studies grading algorithms for randomized exams. In a randomized exam, each student is asked a small number of random questions from a large question bank. The predominant grading rule is simple averaging, i.e., calculating grades by averaging scores on the questions each student is asked, which is fair ex-ante, over the randomized questions, but not fair ex-post, on the realized questions. The fair grading problem is to estimate the average grade of each student on the full question bank. The maximum-likelihood estimator for the Bradley-Terry-Luce model on the bipartite student-question graph is shown to be consistent with high probability when the number of questions asked to each student is at least the cubed-logarithm of the number of students. In an empirical study on exam data and in simulations, our algorithm based on the maximum-likelihood estimator significantly outperforms simple averaging in prediction accuracy and ex-post fairness even with a small class and exam size.
... Joint maximum likelihood (JML [25, 45]) methods treat person abilities θ_p as fixed effects. In JML, the vector of person parameters γ = (θ_1, . . . ...
Article
Full-text available
The Rasch model is one of the most prominent item response models. In this article, different item parameter estimation methods for the Rasch model are systematically compared through a comprehensive simulation study: Different alternatives of joint maximum likelihood (JML) estimation, different alternatives of marginal maximum likelihood (MML) estimation, conditional maximum likelihood (CML) estimation, and several limited information methods (LIM). The type of ability distribution (i.e., nonnormality), the number of items, sample size, and the distribution of item difficulties were systematically varied. Across different simulation conditions, MML methods with flexible distributional specifications can be at least as efficient as CML. Moreover, in many situations (i.e., for long tests), penalized JML and JML with ε adjustment resulted in very efficient estimates and might be considered alternatives to JML implementations currently used in statistical software. Moreover, minimum chi-square (MINCHI) estimation was the best-performing LIM method. These findings demonstrate that JML estimation and LIM can still prove helpful in applied research.
... Given this property of Rasch models, joint maximum likelihood (JML), or unconditional maximum likelihood (Wright & Stone, 1979), is also available and is employed by other Rasch software, such as WINSTEPS (Linacre, 2019b). However, JML cannot produce estimates for individuals with sum scores of zero or the maximum possible score (e.g., all items incorrect or all items correct), and the estimates produced are inconsistent and asymptotically normal (Andersen, 1973; Haberman, 1977, 2004). Additionally, the limitations of JML are due to the simultaneous estimation of item and person measures. ...
Article
Full-text available
The “extended Rasch modeling” (eRm) package in R provides users with a comprehensive set of tools for Rasch modeling for scale evaluation and general modeling. We provide a brief introduction to Rasch modeling followed by a review of literature that utilizes the eRm package. Then, the key features of the eRm package for scale evaluation are reviewed. The ease of use and the advantages of the R environment make the eRm package an appealing, free option for rigorous Rasch modeling. We demonstrate the functionality with a small example using data from the Rosenberg Self-Esteem scale. Using these data we show how to 1) fit dichotomous and polytomous Rasch models, 2) obtain key figures (e.g., item characteristic curves and person-item maps) to aid model assessment, 3) assess dimensionality and 4) obtain item fit statistics.
... Unfortunately, however, the JML estimator is typically statistically inconsistent (Neyman & Scott, 1948; Andersen, 1973; Haberman, 1977; Ghosh, 1995), except under some high-dimensional asymptotic regime that is suitable for large-scale applications (Chen et al., 2019a, b; Haberman, 1977, 2004). Treating both latent variables and parameters as random variables, the third perspective leads to a full Bayesian estimator, for which many Markov chain Monte Carlo (MCMC) algorithms have been developed (e.g., Béguin & Glas, 2001; Bolt & Lall, 2003; Dunson, 2000, 2003; Edwards, 2010). ...
Preprint
Latent variable models have been playing a central role in psychometrics and related fields. In many modern applications, the inference based on latent variable models involves one or several of the following features: (1) the presence of many latent variables, (2) the observed and latent variables being continuous, discrete, or a combination of both, (3) constraints on parameters, and (4) penalties on parameters to impose model parsimony. The estimation often involves maximizing an objective function based on a marginal likelihood/pseudo-likelihood, possibly with constraints and/or penalties on parameters. Solving this optimization problem is highly non-trivial, due to the complexities brought by the features mentioned above. Although several efficient algorithms have been proposed, a unified computational framework that takes all these features into account is lacking. In this paper, we fill this gap. Specifically, we provide a unified formulation for the optimization problem and then propose a quasi-Newton stochastic proximal algorithm. Theoretical properties of the proposed algorithms are established. The computational efficiency and robustness are shown by simulation studies under various settings for latent variable model estimation.
... The regularity conditions and the consistency result are formally described in Theorem 1, with two special cases discussed in the sequel. Similar double asymptotic settings have been considered in psychometric research, including the analyses of unidimensional IRT models (Haberman, 1977, 2004) and diagnostic classification models (Chiu et al., 2016). The following regularity conditions are needed for our main result in Theorem 1. ...
Article
Full-text available
We revisit a singular value decomposition (SVD) algorithm given in Chen et al. (Psychometrika 84:124–146, 2019b) for exploratory item factor analysis (IFA). This algorithm estimates a multidimensional IFA model by SVD and was used to obtain a starting point for joint maximum likelihood estimation in Chen et al. (2019b). Thanks to the analytic and computational properties of SVD, this algorithm guarantees a unique solution and has computational advantage over other exploratory IFA methods. Its computational advantage becomes significant when the numbers of respondents, items, and factors are all large. This algorithm can be viewed as a generalization of principal component analysis to binary data. In this note, we provide the statistical underpinning of the algorithm. In particular, we show its statistical consistency under the same double asymptotic setting as in Chen et al. (2019b). We also demonstrate how this algorithm provides a scree plot for investigating the number of factors and provide its asymptotic theory. Further extensions of the algorithm are discussed. Finally, simulation studies suggest that the algorithm has good finite sample performance.
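The SVD idea described above, a generalization of principal component analysis to binary data, can be roughly sketched as follows. This is a simplified illustration of the core truncate-clip-logit step under assumed sizes, not the exact algorithm of Chen et al. (which includes additional centering and refinement steps):

```python
import numpy as np

rng = np.random.default_rng(4)
n, q, k = 2000, 50, 3   # respondents, items, factors (illustrative sizes)
A = rng.normal(0, 1, (q, k))   # loadings
F = rng.normal(0, 1, (n, k))   # factor scores
M = F @ A.T                    # low-rank logit matrix
X = (rng.random((n, q)) < 1/(1 + np.exp(-M))).astype(float)

# Truncated SVD of the raw binary matrix estimates the probability matrix;
# one extra component absorbs the overall mean level of X.  Clipping keeps
# the subsequent logit transform finite.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
r = k + 1
P_hat = U[:, :r] @ np.diag(d[:r]) @ Vt[:r]
P_hat = np.clip(P_hat, 0.01, 0.99)
M_hat = np.log(P_hat / (1 - P_hat))   # estimated logit matrix, approx. F A^T
```

A further SVD of M_hat would then yield loading estimates up to rotation, which is the consistency statement studied in the note.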
... The regularity conditions and the consistency result are formally described in Theorem 1, with two special cases discussed in the sequel. Similar double asymptotic settings have been considered in psychometric research, including the analysis of unidimensional IRT models (Haberman, 1977, 2004) and diagnostic classification models (Chiu et al., 2016). The following regularity conditions are needed for our main result in Theorem 1. ...
Preprint
In this note, we revisit a singular value decomposition (SVD) based algorithm that was given in Chen et al. (2019a) for obtaining an initial value for joint maximum likelihood estimation of exploratory Item Factor Analysis (IFA). This algorithm estimates a multidimensional item response theory model by SVD. Thanks to the computational efficiency and scalability of SVD, this algorithm has substantial computational advantage over other exploratory IFA algorithms, especially when the numbers of respondents, items, and latent dimensions are all large. Under the same double asymptotic setting and notion of consistency as in Chen et al. (2019a), we show that this simple algorithm provides a consistent estimator for the loading matrix up to a rotation. This result provides theoretical guarantee to the use of this simple algorithm for exploratory IFA.
... Theoretical comparisons of CML and JML have been made on aspects such as existence and uniqueness of solutions (Fischer 1981), construction of approximate confidence intervals (Gilula and Haberman 1995), large-sample properties (Haberman 1977), and behavior of estimators and generalized residuals (Haberman 2004). From these comparisons, it is generally agreed upon that JML is biased in estimating the usual (non-mixture) Rasch model (Kim 2001), although it is easy to implement, whereas CML has the advantage of conditional independence between item difficulties and abilities. ...
Article
The mixture Rasch model is gaining popularity as it allows items to perform differently across subpopulations and hence addresses the violation of the unidimensionality assumption with traditional Rasch models. This study focuses on comparing two common maximum likelihood methods for estimating such models using Monte Carlo simulations. The conditional maximum likelihood (CML) and joint maximum likelihood (JML) estimations, as implemented in three popular R packages are compared by evaluating parameter recovery and class accuracy. The results suggest that in general, CML is preferred in parameter recovery and JML is preferred in identifying the correct number of classes. A set of guidelines is also provided regarding how sample sizes, test lengths or actual class probabilities affect the accuracy of estimation and number of classes, as well as how different information criteria compare in achieving class accuracy. Specific issues regarding the performance of particular R packages are highlighted in the study as well.
... Applications to data from NAEP were presented, and results of the proposed method were compared to results obtained using the current operational procedures. Haberman (2004) discussed joint and conditional ML estimation for the dichotomous Rasch model, explored conditions for consistency and asymptotic normality, investigated effects of model error, estimated errors of prediction, and developed generalized residuals. The same author (Haberman 2005a) showed that if a parametric model for the ability distribution is not assumed, the 2PL and 3PL (but not 1PL) models have identifiability problems that impose restrictions on possible models for the ability distribution. ...
Chapter
Full-text available
Few would doubt that researchers at ETS have contributed more to the general topic of item response theory (IRT) than individuals from any other institution. In this chapter, we review most of those contributions, dividing them into sections by decades of publication. The history of IRT begins before the seminal volume by Lord and Novick (Statistical Theories of Mental Test Scores, Addison-Wesley, Reading, 1968) and ETS researchers were central contributors to those developments, beginning with early work by Fred Lord and Bert Green in the 1950s. The chapter traces a wide range of contributions through the decades, ending with recent work that produced models involving complex latent variable structures and multiple dimensions.
... For estimation of parameters, currently MML estimation of item parameters may be a standard method, where the standard asymptotic theory can be used for, e.g., the approximate standard errors of estimated item parameters with fixed n. On the other hand, for the fixed-effects model when joint maximum likelihood (JML) estimation of the item and ability parameters is used, it has been claimed that both n and N should be increased for, e.g., consistency of parameter estimators (Haberman, 1977, 2004; Trachtenberg, 2000). The latter condition, which is seldom satisfied in practice, is also associated with the problem of incidental parameters (Neyman and Scott, 1948). ...
Article
The 3-parameter logistic (3PL) model including guessing parameters is one of the popular models in item response theory. While the guessing parameters in the fixed-effects 3PL model with non-stochastic abilities have been believed to be identified, some counterexamples, with new ones given in this paper, are currently available. In this paper, the concept of degeneracy for nested models is introduced. Some degenerate cases in the counterexamples are shown to have model identification for guessing parameters, which are further shown to have model unidentification in a more degenerate model. Similar results in the fixed-effects 4-parameter logistic model are also derived.
... Even in the case of the normal 2PL model, an interaction model (Haberman, 2004) yields a value of Ĥ of 0.59112. Here the model used assumes that p is in S and log p(x) = τ_s − Σ_{j=1}^q x_j(β_j + sγ_j) for x in Γ with Σ_{j=1}^q x_j = s, for some real τ_s, 0 ≤ s ≤ q, β_j, 1 ≤ j ≤ q, and γ_j, 1 ≤ j ≤ q. ...
Article
Adaptive quadrature is applied to marginal maximum likelihood estimation for item response models with normal ability distributions. Even in one dimension, significant gains in speed and accuracy of computation may be achieved.
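For the marginal-likelihood side, a plain (non-adaptive) Gauss-Hermite quadrature evaluation of the Rasch marginal log-likelihood under a standard normal ability distribution can be sketched as follows. Adaptive quadrature, as in the abstract above, additionally recenters and rescales the nodes per response pattern; the sizes and parameter values here are illustrative assumptions:

```python
import numpy as np

# Gauss-Hermite nodes/weights for integrals against exp(-x^2);
# rescaled to integrate against the N(0, 1) density.
nodes, weights = np.polynomial.hermite.hermgauss(21)
theta_q = np.sqrt(2.0) * nodes        # quadrature points for N(0, 1) ability
w_q = weights / np.sqrt(np.pi)        # matching weights (sum to 1)

def marginal_loglik(x, beta):
    """Marginal log-likelihood of a binary response matrix x under the
    Rasch model with standard normal abilities, by Gauss-Hermite quadrature."""
    # p[k, j] = P(X_j = 1 | theta = theta_q[k])
    p = 1/(1 + np.exp(-(theta_q[:, None] - np.asarray(beta)[None, :])))
    # likelihood of each person's response pattern at each node
    like = np.prod(np.where(x[:, None, :] == 1, p[None], 1 - p[None]), axis=2)
    return np.log(like @ w_q).sum()

rng = np.random.default_rng(3)
beta = np.array([-1.0, 0.0, 1.0])
theta = rng.normal(0, 1, 200)
x = (rng.random((200, 3)) < 1/(1 + np.exp(-(theta[:, None] - beta[None, :])))).astype(int)
ll = marginal_loglik(x, beta)
```

With few items or extreme parameters the integrand concentrates away from the fixed nodes, which is precisely where the adaptive rescaling pays off.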
Article
Few would doubt that ETS researchers have contributed more to the general topic of item response theory (IRT) than individuals from any other institution. In this report, we briefly review most of those contributions, dividing them into sections by decades of publication, beginning with early work by Fred Lord and Bert Green in the 1950s and ending with recent work that produced models involving complex structures and multiple dimensions.
Article
The property of item parameter invariance in item response theory (IRT) plays a pivotal role in the applications of IRT such as test equating. The scope of parameter invariance when using estimates from finite biased samples in the applications of IRT does not appear to be clearly documented in the IRT literature. This article provides information on the extent to which item parameter invariance is observed in samples with the Rasch and 2-parameter model calibrations through simulations, where the behaviors of item parameter estimates were examined under 12 different types of convenient sampling scenarios. The results indicated that the property of item invariance in IRT for dichotomously scored data could hold for the sample item parameter estimates, regardless of biased samples, when the model holds in the data, the number of items in a test is not small, and the sample size is large. [Full text access: https://www.tandfonline.com/eprint/XVQVUSPIBNTW8GUBDVXN/full?target=10.1080/15366367.2020.1754703]
Article
Examples of the impact of statistical theory on assessment practice are provided from the perspective of a statistician trained in theoretical statistics who began to work on assessments. Goodness of fit of item‐response models is examined in terms of restricted likelihood‐ratio tests and generalized residuals. Minimum discriminant information adjustment is used for linking with no anchors or problematic anchors and for repeater analysis. Assessment issues are examined in cases in which the number of parameters is large relative to the number of observations.
Article
Fisher's classical likelihood has become the standard procedure for making inference about fixed unknown parameters. Recently, inference for unobservable random variables, such as random effects, factors, missing values, etc., has become important in statistical analysis. Because Fisher's likelihood cannot include such unobservable random variables, only the full Bayesian method is available for inference. An alternative likelihood approach is proposed by Lee and Nelder. In the context of Fisher likelihood, the likelihood principle means that the likelihood function carries all relevant information regarding the fixed unknown parameters. Bjørnstad extended the likelihood principle to the extended likelihood principle: all information in the observed data for fixed unknown parameters and unobservables is in the extended likelihood, such as the h-likelihood. However, it turns out that the use of extended likelihood for inference is not as straightforward as the Fisher likelihood. In this paper, we describe how to extract information from the data via the h-likelihood. This provides a new way of statistical inference for entire fields of statistical science. This article is categorized under: Statistical Models > Generalized Linear Models; Algorithms and Computational Methods > Maximum Likelihood Methods.
Article
Full-text available
Rasch-type item response models are often estimated via a conditional maximum-likelihood approach. This article elaborates on the asymptotics of conditional maximum-likelihood estimates for an increasing number of items, important for modern data settings where a large number of items need to be scaled. Using approximations of the variance–covariance matrix based on Edgeworth expansions, the problem is studied theoretically as well as computationally. In a subsequent step, these results are used to split the large-scale estimation problem into smaller sub-problems involving blocks of items. These item blocks are estimated separately from each other and, finally, merged back into the full parameter vector (divide-and-conquer). By means of simulation studies, and in conjunction with the asymptotic results, it was found that block sizes in the range of 30–40 items approximate the full-scale estimators with a negligible loss in precision. It is also shown how varying block sizes affect the running time needed to fit the model.
Article
Marginal likelihood-based methods are commonly used in factor analysis for ordinal data. To obtain the maximum marginal likelihood estimator, the full information maximum likelihood (FIML) estimator uses (adaptive) Gauss–Hermite quadrature or stochastic approximation. However, the computational burden increases rapidly as the number of factors increases, which renders FIML impractical for large factor models. Another limitation of the marginal likelihood-based approach is that it does not allow inference on the factors. In this study, we propose a hierarchical likelihood approach using the Laplace approximation that remains computationally efficient in large models. We also propose confidence intervals for the factors, which maintain their level of confidence as the sample size increases. A simulation study shows that the proposed approach generally works well.
Article
Joint maximum likelihood estimation (JMLE) is developed for diagnostic classification models (DCMs). JMLE has rarely been used in psychometrics because JMLE parameter estimators typically lack statistical consistency. The JMLE procedure presented here resolves the consistency issue by incorporating an external, statistically consistent estimator of examinees' proficiency class membership into the joint likelihood function, which subsequently allows for the construction of item parameter estimators that also have the consistency property. Consistency of the JMLE parameter estimators is established within the framework of general DCMs: the JMLE parameter estimators are derived for the Loglinear Cognitive Diagnosis Model (LCDM). Two consistency theorems are proven for the LCDM. Using the framework of general DCMs makes the results and proofs also applicable to DCMs that can be expressed as submodels of the LCDM. Simulation studies are reported for evaluating the performance of JMLE when used with tests of varying length and different numbers of attributes. As a practical application, JMLE is also used with “real world” educational data collected with a language proficiency test.
Article
If a parametric model for the ability distribution is not assumed, then the customary two-parameter and three-parameter logistic models for item response analysis present identifiability problems not encountered with the Rasch model. These problems impose substantial restrictions on possible models for ability distributions.
Article
Multinomial-response models are available that correspond implicitly to tests in which a total score is computed as the sum of polytomous item scores. For these models, joint and conditional estimation may be considered in much the same way as for the Rasch model for right-scored tests. As in the Rasch model, joint estimation is only attractive if both the number of items and the number of examinees are large, while conditional estimation can be employed for a large number of examinees whether or not the number of items is large. In neither case is computation difficult given currently available computers. Large-sample results favor use of conditional estimation, although some use of joint estimation can be contemplated if the number of items is large.
Article
Latent-class item response models with small numbers of latent classes are quite competitive in terms of model fit to corresponding item-response models, at least for one- and two-parameter logistic (1PL and 2PL) models. Provided that care is taken in terms of computational procedures and in terms of use of only limited numbers of latent classes, computations are relatively simple in the case of latent classes.
Article
Full-text available
The paper offers a general review of the basic concepts of both statistical model and parameter identification, and revisits the conceptual relationships between parameter identification and both parameter interpretability and the properties of parameter estimates. All these issues are then exemplified for the 1PL, 2PL, and 1PL-G fixed-effects models. For the 3PL model, however, we provide a theorem proving that the item parameters are not identified and do not have an empirical interpretation, and that it is not possible to obtain consistent and unbiased estimates of them.
Article
Full-text available
Necessary and sufficient conditions for the existence and uniqueness of a solution of the so-called unconditional (UML) and the conditional (CML) maximum-likelihood estimation equations in the dichotomous Rasch model are given. The basic critical condition is essentially the same for UML and CML estimation. For complete data matrices A, it is formulated both as a structural property of A and in terms of the sufficient marginal sums. In the case of incomplete data, the condition is equivalent to complete connectedness of a certain directed graph. It is shown how to apply the results in practical uses of the Rasch model.
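The graph condition described in this abstract can be sketched directly: draw an edge from item i to item j whenever some examinee answers item i correctly and item j incorrectly; the estimates exist and are unique exactly when this digraph is completely connected (strongly connected). A minimal illustration, with invented data matrices:

```python
def cml_estimates_exist(X):
    """Sketch of the graph condition for a complete dichotomous data
    matrix X (rows = examinees, columns = items): edge i -> j whenever
    some examinee scores 1 on item i and 0 on item j; estimates exist
    and are unique iff the digraph is strongly connected."""
    q = len(X[0])
    adj = [[False] * q for _ in range(q)]
    for row in X:
        for i in range(q):
            for j in range(q):
                if row[i] == 1 and row[j] == 0:
                    adj[i][j] = True

    def reachable(start):
        # depth-first search over the item digraph
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in range(q):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    # strongly connected <=> every item reaches every other item
    return all(len(reachable(s)) == q for s in range(q))

# mixed response patterns connect the items in both directions
ok = cml_estimates_exist([[1, 0], [0, 1], [1, 1]])
# item 1 is never passed while item 0 is failed: no edge 1 -> 0
bad = cml_estimates_exist([[1, 0], [1, 0], [1, 1]])
```

Rows of constant responses (all-correct or all-incorrect examinees) contribute no edges, which matches the familiar practice of removing perfect scores before estimation.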
Book
Ronald K. Hambleton; H. Swaminathan; H. Jane Rogers. Fundamentals of Item Response Theory.
Article
Conditional log-linear models are developed for panel data and used to predict sequences of categorical responses. The class of models considered includes conventional Markov models and independence models as well as distance models in which all previous responses and present and past values of covariates are used to predict the current response. The approach taken in this article has some advantages over the marginal modeling approach that has become popular for longitudinal studies. Quality of prediction is measured by using a logarithmic penalty function. Given a model, conditional probabilities of responses consistent with the model are selected to provide the smallest expected penalty. This minimum expected penalty provides a measure of the predictive power of a model. Models are compared through their predictive power, as measured by the proportional reduction in expected penalty. Ways of incorporating the number of parameters of the competing models are discussed. This emphasis on predictive power contrasts with the conventional emphasis on goodness-of-fit tests. In the case of random sampling, estimates are provided for optimal prediction functions consistent with the model and for measures of predictive power. Large-sample approximations are provided for assessing the accuracy of parameter estimates and of estimated measures of quality of prediction. For measures of quality of prediction, assessments are provided for the bias of estimates. To illustrate techniques, analyses are performed on data from the National Longitudinal Study of Youth on attitudes toward a military career. Because these data are available for the same subject for each of seven years, and because demographic data are available on individual subjects, these data provide a nontrivial application. 
Because more than 8,000 observations are available, statistical models of practical interest do not fit the data according to conventional criteria, but they still have value in predicting subject responses. Analysis of the data shows that subjects' responses are linked much more closely to their previous responses than to demographic variables. A common Markov model for subject responses is shown to be inferior to other models in terms of predictive power. Methods considered are shown to apply to cases in which censoring or nonresponse problems exist.
Article
A recursive equation for computing higher-order derivatives of the elementary symmetric functions in the Rasch model is proposed. The formula is conceptually simple and more efficient than the sum algorithm (Gustafsson, 1980). A simulation study indicated that the proposed formula incurs only a small loss in accuracy, compared to the sum algorithm, when computing higher-order derivatives for tests of 60 items or fewer. Index terms: difference algorithm, elementary symmetric functions, Formann's equation, Jansen's equation, Rasch model, sum algorithm.
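A sketch of the kind of recursion at issue here, simplified to first derivatives and with invented parameter values: the derivative of the elementary symmetric function gamma_r with respect to an item parameter eps_j equals gamma_{r-1} computed over the remaining items, and those reduced functions can be obtained either by re-running the sum algorithm on the reduced item set or by the faster difference recursion gamma_r^(j) = gamma_r - eps_j * gamma_{r-1}^(j):

```python
def esf(eps):
    """Elementary symmetric functions gamma_0..gamma_q (sum algorithm)."""
    g = [1.0] + [0.0] * len(eps)
    for e in eps:
        for r in range(len(g) - 1, 0, -1):
            g[r] += e * g[r - 1]
    return g

def esf_without(gamma, e_j):
    """Difference recursion for the ESFs with one item removed:
    gamma_r^(j) = gamma_r - e_j * gamma_{r-1}^(j).
    The derivative d gamma_r / d eps_j is then gamma_{r-1}^(j)."""
    g = [1.0]
    for r in range(1, len(gamma) - 1):
        g.append(gamma[r] - e_j * g[r - 1])
    return g

eps = [0.4, 1.1, 0.9, 1.6]           # invented item parameters
gamma = esf(eps)
derivs = esf_without(gamma, eps[0])  # d gamma_r / d eps_0 = derivs[r - 1]
```

As the abstract notes, difference-style recursions of this kind trade accuracy for speed on long tests, which is exactly the trade-off the article studies.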
Article
Estimation procedures for the item parameters in the Rasch model for dichotomous items are briefly reviewed, and it is concluded that the statistically correct conditional maximum likelihood (CML) method has not been used because of numerical problems. A solution to these problems is presented which allows rapid computation of the CML estimates even for long tests. It is pointed out that there may be only small differences between the numerical values of the CML estimates and other estimates, but that the CML approach has decisive advantages in the construction of statistical tests of goodness of fit.
Article
The Rasch model for item analysis is an important member of the class of exponential response models in which the number of nuisance parameters increases with the number of subjects, leading to the failure of the usual likelihood methodology. Both conditional-likelihood methods and mixture-model techniques have been used to circumvent these problems. In this article, we show that these seemingly unrelated analyses are in fact closely linked to each other, despite dramatic structural differences between the classes of models implied by each approach. We show that the finite-mixture model for J dichotomous items having T latent classes gives the same estimates of item parameters as conditional likelihood on a set whose probability approaches one if T ≥ (J + 1)/2. Unconditional maximum likelihood estimators for the finite-mixture model can be viewed as Kiefer-Wolfowitz estimators for the random-effects version of the Rasch model. Latent-class versions of the model are especially attractive when T is small relative to J. We analyze several sets of data, propose simple diagnostic checks, and discuss procedures for assigning scores to subjects based on posterior means. A flexible and general methodology for item analysis based on latent class techniques is proposed.
Article
In this paper we discuss the numerical solution of a set of conditional likelihood equations essential for the statistical analysis of psychological questionnaires. The model underlying this type of statistical analysis is described in detail and the conditional maximum‐likelihood method that is used for estimating item parameters is discussed in relation to the chosen model. The conditional likelihood equations contain a generalized version of the elementary symmetric functions. The main problem in solving the estimation equations consists in finding a rapid recursive way to compute these functions. Such a procedure is described in the paper. In the final section of the paper the developed method is illustrated by a numerical example.
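In the dichotomous case, the recursive computation described here is the familiar summation recursion for the elementary symmetric functions: process the items one at a time, updating each function from the previous item's values. A minimal sketch with invented item parameters, checked against the brute-force definition:

```python
from itertools import combinations
from math import prod

def elementary_symmetric(eps):
    """Summation recursion for the elementary symmetric functions
    gamma_0, ..., gamma_q of the item parameters eps: add one item at a
    time, updating gamma_r <- gamma_r + eps_j * gamma_{r-1} with r
    descending so each update uses the previous item's values."""
    gamma = [1.0] + [0.0] * len(eps)
    for e in eps:
        for r in range(len(gamma) - 1, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma

eps = [0.5, 1.2, 0.8, 2.0]                   # invented item parameters
gamma = elementary_symmetric(eps)            # O(q^2) work
brute = [sum(prod(c) for c in combinations(eps, r))
         for r in range(len(eps) + 1)]       # O(2^q) work, for checking
```

The recursion does in O(q^2) operations what the defining sum over item subsets would take exponential time to evaluate, which is what makes CML estimation feasible for long tests.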
Article
The Rasch model is an item analysis model with logistic item characteristic curves of equal slope, i.e., with constant item discriminating powers. The proposed goodness-of-fit test is based on a comparison between difficulties estimated from different score groups and overall estimates. Based on the within-score-group estimates and the overall estimates of item difficulties, a conditional likelihood ratio is formed. It is shown that −2 times the logarithm of this ratio is χ²-distributed when the Rasch model is true. The power of the proposed goodness-of-fit test is discussed for alternative models with logistic item characteristic curves but unequal discriminating powers, using items from a scholastic aptitude test.
Article
Includes bibliography and index. v.1. Introductory topics -- v.2. New developments
Article
It is shown that, under usual regularity conditions, the maximum likelihood estimator of a structural parameter is strongly consistent, when the (infinitely many) incidental parameters are independently distributed chance variables with a common unknown distribution function. The latter is also consistently estimated although it is not assumed to belong to a parametric class. Application is made to several problems, in particular to the problem of estimating a straight line with both variables subject to error, which thus after all has a maximum likelihood solution.
Article
Consider a sample of size $n$ from a regular exponential family in $p_n$ dimensions. Let $\hat\theta_n$ denote the maximum likelihood estimator, and consider the case where $p_n$ tends to infinity with $n$ and where $\{\theta_n\}$ is a sequence of parameter values in $R^{p_n}$. Moment conditions are provided under which $\|\hat\theta_n - \theta_n\| = O_p(\sqrt{p_n/n})$ and $\|\hat\theta_n - \theta_n - \overline{X}_n\| = O_p(p_n/n)$, where $\overline{X}_n$ is the sample mean. The latter result provides normal approximation results when $p_n^2/n \rightarrow 0$. It is shown by example that even for a single coordinate of $(\hat\theta_n - \theta_n)$, $p_n^2/n \rightarrow 0$ may be needed for normal approximation. However, if $p_n^{3/2}/n \rightarrow 0$, the likelihood ratio test statistic $\Lambda$ for a simple hypothesis has a chi-square approximation in the sense that $(-2 \log \Lambda - p_n)/\sqrt{2p_n} \rightarrow_D \mathscr{N}(0, 1)$.
Article
Simplified conditions are given for consistency and asymptotic normality of M-estimates derived by maximization of averages of independent identically distributed random concave functions. Applications are made to maximum likelihood estimation.
Article
In the case of frequency data, traditional discussions such as Rao (1973, pages 355-363, 391-412) consider asymptotic properties of maximum likelihood estimates and chi-square statistics under the assumption that all expected cell frequencies become large. If log-linear models are applied, these asymptotic properties may remain applicable if the sample size is large and the number of cells in the table is large, even if individual expected cell frequencies are small. Conditions are provided for asymptotic normality of linear functionals of maximum-likelihood estimates of log-mean vectors and for asymptotic chi-square distributions of Pearson and likelihood ratio chi-square statistics.
Article
Exponential response models are a generalization of logit models for quantal responses and of regression models for normal data. In an exponential response model, $\{F(\theta): \theta \in \Theta\}$ is an exponential family of distributions with natural parameter $\theta$ and natural parameter space $\Theta \subset V$, where $V$ is a finite-dimensional vector space. A finite number of independent observations $S_i$, $i \in I$, are given, where for $i \in I$, $S_i$ has distribution $F(\theta_i)$. It is assumed that $\mathbf{\theta} = \{\theta_i: i \in I\}$ is contained in a linear subspace. Properties of maximum likelihood estimates $\hat{\mathbf{\theta}}$ of $\mathbf{\theta}$ are explored. Maximum likelihood equations and necessary and sufficient conditions for existence of $\hat{\mathbf{\theta}}$ are provided. Asymptotic properties of $\hat{\mathbf{\theta}}$ are considered for cases in which the number of elements in $I$ becomes large. Results are illustrated by use of the Rasch model for educational testing.
Article
Prediction of categorical responses in panel studies is considered. Prediction functions based on general conditional log-linear models are investigated for statistical properties both from a population perspective and a sampling perspective. Problems such as existence and uniqueness of optimal prediction functions are addressed, and basic properties of measures of prediction quality are examined. Estimation, consistency and asymptotic normality are studied for the proposed parameter estimates and measures of prediction quality.
Article
Thesis--Copenhagen. Summary in Danish. Bibliography: p. 210-219.
Article
Inaug.-Diss.--Uppsala. Extracted from Acta mathematica, v. 77. Bibliography: p. 123-125.
Article
The problem of characterizing the manifest probabilities of a latent trait model is considered. The item characteristic curve is transformed to the item passing-odds curve and a corresponding transformation is made on the distribution of ability. This results in a useful expression for the manifest probabilities of any latent trait model. The result is then applied to give a characterization of the Rasch model as a log-linear model for a $2^J$ contingency table. Partial results are also obtained for other models. The question of the identifiability of “guessing” parameters is also discussed.
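The characterization rests on a key identity: whatever the ability distribution, the conditional probability of a Rasch response pattern given its total score is a ratio of item-parameter products to an elementary symmetric function, free of that distribution; this score-conditional structure is what makes the manifest probabilities log-linear. A numerical sketch of the identity, with all parameter values invented:

```python
from itertools import combinations, product
from math import exp, prod

beta = [-0.5, 0.0, 0.7]                           # invented item difficulties
ability = [(-1.0, 0.3), (0.0, 0.4), (1.5, 0.3)]   # invented discrete ability dist.

def p_manifest(x):
    """Manifest probability of response pattern x: Rasch pattern
    probabilities mixed over the ability distribution."""
    total = 0.0
    for theta, w in ability:
        p_item = [exp(theta - b) / (1 + exp(theta - b)) for b in beta]
        total += w * prod(p if xi else 1 - p for xi, p in zip(x, p_item))
    return total

# item parameters eps_j = exp(-beta_j) and their elementary symmetric functions
eps = [exp(-b) for b in beta]
gamma = [sum(prod(c) for c in combinations(eps, r))
         for r in range(len(eps) + 1)]

def p_conditional(x):
    """P(X = x | S(x) = r) computed from the manifest probabilities."""
    r = sum(x)
    denom = sum(p_manifest(y) for y in product([0, 1], repeat=len(beta))
                if sum(y) == r)
    return p_manifest(x) / denom

x = (1, 0, 1)
lhs = p_conditional(x)                                        # via the mixture
rhs = prod(e ** xi for e, xi in zip(eps, x)) / gamma[sum(x)]  # via gamma_r
```

Changing the `ability` mixture leaves `p_conditional` unchanged, which is precisely why conditional maximum likelihood can estimate the item parameters without modeling the ability distribution.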
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.