Seock-Ho Kim

University of Georgia | UGA · Department of Educational Psychology

About

79 Publications
11,532 Reads
3,122 Citations


Publications (79)
Article
In the original publication of the article, Appendix A was published incorrectly.
Chapter
A review of various priors used in Bayesian estimation under the two-parameter logistic model is presented together with clear mathematical definitions of the prior distributions. Examples that compared Bayesian estimation methods are presented using empirical data. The effects of the priors and their specifications on both item and ability paramet...
Article
This paper contrasts three methods of item analysis for multiple-choice items based on classical test theory, generalized linear modeling, and item response theory. Illustrations of the methods are presented with contrived and real data. Specifically, the methods, respectively, use a cross-classification table under classical test theory, a new bas...
Article
Mixture Rasch (MixRasch) models conventionally assume normal distributions for latent ability. Previous research has shown that the assumption of normality is often unmet in educational and psychological measurement. When normality is assumed, asymmetry in the actual latent ability distribution has been shown to result in extraction of spurious lat...
Article
Parameter estimation of Item Response Theory (IRT) models can be applied using both Bayesian and non-Bayesian methods. Although maximum likelihood estimation (MLE), a non-Bayesian method, has predominated since the 1970s, there is an increasing use of Bayesian methods, due to their capability for estimating complex models and for their implementati...
Article
Item response theory modeling articles from 83 years of Psychometrika are sorted based on the taxonomy by Thissen and Steinberg (1986). Results from 377 research and review articles indicate that the usual unidimensional parametric item response theory models for dichotomous items were employed in 51 per cent of the articles. The usual unidimension...
Article
Covariates have been used in mixture IRT models to help explain why examinees are classed into different latent classes. Previous research has considered manifest variables as covariates in a mixture Rasch analysis for prediction of group membership. Latent covariates, however, are more likely to have higher correlations with the latent class varia...
Chapter
The main aim of this study is to report on the frequency with which different item response theory models are employed in Psychometrika articles. Articles relevant to item response theory modeling in Psychometrika for 82 years (1936–2017) are sorted based on the classification framework by Thissen and Steinberg (Item response theory: Parameter estima...
Article
A review of various priors used in Bayesian estimation under the Rasch model is presented together with clear mathematical definitions of the hierarchical prior distributions. A Bayesian estimation method, Gibbs sampling, was compared with conditional, marginal, and joint maximum likelihood estimation methods using the Knox Cube Test data under the...
Article
The primary purpose of the study was to determine whether atypical knee biomechanics are exhibited during landing on an inverted surface. A seven-camera motion analysis system and two force plates were used to collect lower extremity biomechanics from two groups of female participants: 21 subjects with chronic ankle instability (CAI) and 21 with pa...
Article
A brief explication of the implementation of the Gibbs sampling method via rejection sampling to obtain Bayesian estimates of difficulty and ability parameters under the Rasch model is presented. The Gibbs sampling method via rejection sampling was used in conjunction with the computer program OpenBUGS. Examples that compared the estimation method...
Article
This study investigates the effect of multidimensionality on extraction of latent classes in mixture Rasch models. In this study, two‐dimensional data were generated under varying conditions. The two‐dimensional data sets were analyzed with one‐ to five‐class mixture Rasch models. Results of the simulation study indicate the mixture Rasch model ten...
Article
Mixture item response theory (MixIRT) models can sometimes be used to model the heterogeneity among the individuals from different subpopulations, but these models do not account for the multilevel structure that is common in educational and psychological data. Multilevel extensions of the MixIRT models have been proposed to address this shortcomin...
Article
A brief review of various information criteria is presented for the detection of differential item functioning (DIF) under item response theory (IRT). An illustration of using information criteria for model selection as well as results with simulated data are presented and contrasted with the IRT likelihood ratio (LR) DIF detection method. Use of i...
Article
Much remains unclear about how chronic ankle instability (CAI) could affect knee muscle activations and interact with knee biomechanics. Therefore, the purpose of this study was to assess the influence of CAI on the lower extremity muscle activation at the ankle and knee joints during landings on a tilted surface. A surface electromyography system...
Article
The purpose of this study was to compare four popular value-added models used in measuring school effectiveness based on their distinguishing characteristics. In this study, the simple fixed effects model (SFEM), two hierarchical models (UHLMM and AHLMM), and the layered mixed effects model (LMEM) were analyzed using value-added measures obtained f...
Article
Background context: After spinal fusion, adolescent idiopathic scoliosis individuals (SF-AIS) often return to exercise and sport. However, the movements SF-AIS use to compensate for the loss of spinal flexibility during high-effort tasks are not known. Purpose: To compare, between SF-AIS and healthy controls (CON) groups, the spinal kinematics o...
Conference Paper
Hotelling’s canonical correlation is the Pearson product moment correlation between two weighted linear composites from two sets of variables. The two composites constitute a set of canonical variates, namely, a criterion variate and a predictor variate. Many statistical analyses in psychometrics deal with fallible data that contain measurement err...
Chapter
When you speak of having information, it implies that you know something about a particular object or topic. In statistics and psychometrics, the term information conveys a similar, but somewhat more technical, meaning. The statistical meaning of information is credited to Sir R.A. Fisher, who defined information as the reciprocal of the variance w...
Chapter
For didactic purposes, all of the preceding chapters have assumed that the metric of the ability scale was known. This metric had a midpoint of zero, a unit of measurement of 1, and a range from negative infinity to positive infinity. The numerical values of the item parameters and the examinee’s ability parameters have been expressed in this metri...
Chapter
In the first chapter the properties of the item characteristic curve were defined in terms of verbal descriptors. While this is useful to obtain an intuitive understanding of item characteristic curves, it lacks the precision and rigor needed by a theory. Consequently, in this chapter the reader will be introduced to three mathematical models for t...
Chapter
In many educational and psychological measurement situations there is an underlying variable of interest. This variable is often something that is intuitively understood, such as “intelligence.” People can be described as being bright or average and the listener has some idea as to what the speaker is conveying about the object of the discussion.
Chapter
Under item response theory, the primary purpose for administering a test to an examinee is to locate that person on the ability scale. If such an ability measure can be obtained for each person taking the test, two goals can be achieved. First, the examinee can be evaluated in terms of how much underlying ability he or she possesses. Second, compa...
Chapter
Item response theory is based upon the individual items of a test, and up to this point the chapters have dealt with the items one at a time. Now, attention will be given to dealing with all the items in a test at once. When scoring a test, the response made by an examinee to each item is dichotomously scored. A correct response is given a score o...
Chapter
Because the actual values of the parameters of the items in a test are unknown, one of the tasks performed when a test is analyzed under item response theory is to estimate these parameters. The obtained item parameter estimates then provide information as to the technical properties of the test items. To keep matters simple in the following presen...
Chapter
During this transitional period in testing practices, many tests have been designed and constructed using classical test theory principles but have been analyzed via item response theory procedures. This lack of congruence between the construction and analysis procedures has resulted in the full power of item response theory not being exploited. In...
Book
This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is t...
Article
Unidimensional item response theory (IRT) models assume a single homogeneous population. Mixture IRT (MixIRT) models can be useful when subpopulations are suspected. The usual MixIRT model is typically estimated assuming a normally distributed latent ability. Research on normal finite mixture models suggests that latent classes potentially can be...
Chapter
In the continuation ratio model, continuation ratio logits are used to model the probabilities of obtaining ordered categories in polytomously scored items. The continuation ratio model is an alternative to other models for ordered category items such as the graded response model, the generalized partial credit model, and the partial credit model. T...
Article
Background: Patellar tendinopathy (PT) is a common degenerative condition in physically active populations. Knowledge regarding the biomechanics of landing in populations with symptomatic PT is limited, but altered mechanics may play a role in the development or perpetuation of PT. Purpose: To identify whether study participants with PT exhibited d...
Chapter
This study was designed to provide an empirical comparison of three IRT calibration programs, IRTPRO, BILOG-MG, and IRTLRDIF, all of which can be used for detecting differential item functioning (DIF). The three programs were compared for each of three dichotomous IRT models, the one-parameter logistic, the two-parameter logistic, and the three-par...
Chapter
Although many theoretical papers on the estimation method of marginal maximum likelihood of item parameters for various models under item response theory mentioned Gauss–Hermite quadrature formulas, almost all computer programs that implemented marginal maximum likelihood estimation employed other numerical integration methods (e.g., Newton–Cotes f...
Article
The presence of nuisance dimensionality is a potential threat to the accuracy of results for tests calibrated using a measurement model such as a factor analytic model or an item response theory model. This article describes a mixture group bifactor model to account for the nuisance dimensionality due to a testlet structure as well as the dimension...
Article
Centenarians represent a rare but rapidly growing segment of the oldest-old. This study presents item-level data from the Mini-Mental State Examination (MMSE) in a cross-sectional, population-based sample of 244 centenarians and near-centenarians (aged 98-108, 16% men, 21% African-American, 38% community dwelling) from the Georgia Centenarian Study...
Chapter
Item response theory (IRT) models have been widely used for various educational and psychological testing purposes such as detecting differential item functioning (DIF), test construction, ability estimation, equating, and computer adaptive testing. The main assumption underlying these models is that local independence holds with respect to the lat...
Article
Markov chain Monte Carlo (MCMC) algorithms have been shown to be useful for estimation of complex item response theory (IRT) models. Although an MCMC algorithm can be very useful, it also requires care in use and interpretation of results. In particular, MCMC algorithms generally make extensive use of priors on model parameters. In this paper, MCMC...
Article
Centenarians display a broad variation in physical abilities, from independence to bed-bound immobility. This range of abilities makes it difficult to evaluate functioning using a single instrument. Using data from a population-based sample of 244 centenarians (M(Age) = 100.57 years, 84.8% women, 62.7% institutionalized, and 21.3% African American)...
Data
Supplementary material for this article describes analyses and procedures pertaining to: 1) handling of missing values regarding physical performance, 2) comparison of scoring cut-offs between EPESE and GCS samples, 3) physical performance scores by gender and age group, 4) distribution of extreme scores by physical performance measure, 5) characte...
Article
A latent transition analysis (LTA) model was described with a mixture Rasch model (MRM) as the measurement model. Unlike the LTA, which was developed with a latent class measurement model, the LTA-MRM permits within-class variability on the latent variable, making it more useful for measuring treatment effects within latent classes. A simulation st...
Article
Current methods for detecting growth of students' problem-solving skills in math focus mainly on analyzing changes in test scores. Score-level analysis, however, may fail to reflect subtle changes that might be evident at the item level. This article demonstrates a method for studying item-level changes using data from a multiwave experiment with a...
Article
This study examines model selection indices for use with dichotomous mixture item response theory (IRT) models. Five indices are considered: Akaike's information coefficient (AIC), Bayesian information coefficient (BIC), deviance information coefficient (DIC), pseudo-Bayes factor (PsBF), and posterior predictive model checks (PPMC). The five indice...
Article
Data from a large-scale performance assessment (N= 105,731) were analyzed with five differential item functioning (DIF) detection methods for polytomous items to examine the congruence among the DIF detection methods. Two different versions of the item response theory (IRT) model-based likelihood ratio test, the logistic regression likelihood ratio...
Article
The procedures required to obtain the approximate posterior standard deviations of the parameters in the three commonly used item response models for dichotomous items are described and used to generate values for some common situations. The results were compared with those obtained from maximum likelihood estimation. It is shown that the use of pr...
Article
The computer program LDIP provides indices of local dependence for polytomous items under item response theory (cf. Chen, 1998). The indices are the Pearson chi-square statistic X² (Agresti, 1996), the likelihood ratio chi-square statistic G² (Agresti, 1996), Yen's (1993) index of local dependence Q3, and the Fisher-transformed correlation differen...
Article
We examine the performance of the GRM applied to idealized polytomous questionnaire data under conditions of varying scale length, sample size, and distribution form. Comparisons with previous work on dichotomous data are drawn. The findings should help guide the study of differential item functioning and measurement equivalence. Item response theo...
Article
Detection of differential item functioning (DIF) on items intentionally constructed to favor one group over another was investigated on item parameter estimates obtained from two item response theory-based computer programs, LOGIST and BILOG. Signed- and unsigned-area measures based on joint maximum likelihood estimation, marginal maximum likelihoo...
Article
The purpose of this study was to determine the psychometric characteristics of a phonological awareness assessment for prekindergarten children using Messick's (1989) framework for unitary construct validity. Upon entry into prekindergarten, children were given rhyme discrimination, syllable segmentation, initial phoneme isolation, and phoneme blen...
Chapter
The accuracy of Bayesian procedures was considered for estimation of parameters in the two-parameter logistic item response theory model. For the example data, all estimation procedures yielded comparable item and ability estimates. Simulated data sets were also analyzed using six estimation procedures for item parameters. Hierarchical Bayes estima...
Article
This paper is concerned with statistical issues in differential item functioning (DIF). Four subsets of large scale performance assessment data from the Georgia Kindergarten Assessment Program-Revised (N=105,731; N=10,000; N=1,000; and N=100) were analyzed using three DIF detection methods for polytomous items to examine the congruence among the DIF...
Article
The number of replications in Monte Carlo simulation studies can be modified to improve the precision of parameter estimates. Given the speed and power of microcomputers, it is not necessary to hold the number of replications to past levels. Reasons why increasing the number of replications is not necessary for satisfactory levels of precision are...
Article
The accuracy of the Gibbs sampling Markov chain Monte Carlo procedure was examined for estimating item and person (θ) parameters in the one-parameter logistic model. Four datasets were analyzed using the Gibbs sampling method, conditional maximum likelihood, marginal maximum likelihood, and joint maximum likelihood. Maximum likelihood and expected...
Article
The accuracy of the Markov Chain Monte Carlo (MCMC) procedure Gibbs sampling was considered for estimation of item parameters of the two-parameter logistic model. Data for the Law School Admission Test (LSAT) Section 6 were analyzed to illustrate the MCMC procedure. In addition, simulated data sets were analyzed using the MCMC, marginal Bayesian es...
Article
This paper provides a review of procedures for detection of differential item functioning (DIF) for item response theory (IRT) and observed score methods for the graded response model. In addition, data from a test anxiety scale were analyzed to examine the congruence among these procedures. Data from Nasser, Takahashi, and Benson (1997) were reana...
Article
The accuracy of Gibbs sampling, a Markov chain Monte Carlo procedure, was considered for estimation of item and ability parameters under the two-parameter logistic model. Memory test data were analyzed to illustrate the Gibbs sampling procedure. Simulated data sets were analyzed using Gibbs sampling and the marginal Bayesian method. The marginal Ba...
Article
Type I error rates of the likelihood ratio test for the detection of differential item functioning (DIF) in the partial credit model were investigated using simulated data. The partial credit model with four ordered performance levels was used to generate data sets of a 30-item test for samples of 300 and 1,000 simulated examinees. Three different...
Article
The Behrens-Fisher problem arises when one seeks to make inferences about the means of two normal populations without assuming the variances are equal. This paper presents a review of fundamental concepts and applications used to address the Behrens-Fisher problem under fiducial, Bayesian, and frequentist approaches. Methods of approximations to th...
Article
Linking tests from different calibrations under item response theory requires calculation of the slope and intercept of the appropriate linear transformation. Results from five linking methods under the graded response model were investigated using simulated datasets: the characteristic curve method, the minimum χ² method, and three mean and a meth...
Article
Applications of item response theory (IRT) to practical testing problems, including equating, differential item functioning, and computerized adaptive testing, require a common metric for item parameter estimates. This study compared three methods for developing a common metric under IRT: (1) linking separate calibration runs using equating coeffic...
Article
Type I error rates of the likelihood ratio test for the detection of differential item functioning (DIF) were investigated using Monte Carlo simulations. The graded response model with five ordered categories was used to generate data sets of a 30-item test for samples of 300 and 1,000 simulated examinees. All DIF comparisons were simulated by rand...
Article
Applications of item response theory to practical testing problems including equating, differential item functioning, and computerized adaptive testing, require that item parameter estimates be placed onto a common metric. In this study, two methods for developing a common metric for the graded response model under item response theory were compare...
Article
The Behrens-Fisher distribution arises when one seeks to make inferences about the means of two normal populations without assuming the variances are equal. This paper presents tables of percentage points of the Behrens-Fisher distribution for tests of significance.
Article
Type I error rates for the likelihood ratio test for detecting differential item functioning (DIF) were investigated using Monte Carlo simulations. Two- and three-parameter item response theory (IRT) models were used to generate 100 datasets of a 50-item test for samples of 250 and 1,000 simulated examinees for each IRT model. Item parameters wer...
Article
This study compares three procedures for the detection of differential item functioning (DIF) under item response theory (IRT): (a) Lord's chi-square, (b) Raju's area measures, and (c) the likelihood ratio test. Relations among the three procedures and some practical considerations, such as linking metrics and scale purification, are discussed. Dat...
Article
Detection of differential item functioning (DIF) is most often done between two groups of examinees under item response theory. It is sometimes important, however, to determine whether DIF is present in more than two groups. In this article we present a method for detection of DIF in multiple groups. The method is closely related to Lord's chi-squa...
Article
The minimum χ² method for computing equating coefficients for tests with dichotomously scored items was extended to the case of Samejima's graded response items. The minimum χ² method was compared with the test response function method (also referred to as the test characteristic curve method) in which the equating coefficients were obtained by mat...
Article
Type I error rates of Lord's χ² test for differential item functioning were investigated using Monte Carlo simulations. Two- and three-parameter item response theory (IRT) models were used to generate 50-item tests for samples of 250 and 1,000 simulated examinees. Item parameters were estimated using two algorithms (marginal maximum likelihood e...
Article
Hierarchical Bayes procedures for the two-parameter logistic item response model were compared for estimating item and ability parameters. Simulated data sets were analyzed via two joint and two marginal Bayesian estimation procedures. The marginal Bayesian estimation procedures yielded consistently smaller root mean square differences than the joi...
Article
Methods for detecting differential item functioning (DIF) have been proposed primarily for the item response theory dichotomous response model. Three measures of DIF for the dichotomous response model are extended to include Samejima's graded response model: two measures based on area differences between item true score functions, and a χ² statist...
Article
The area between item response functions estimated in different samples is often used as a measure of differential item functioning (DIF). Under item response theory, this area should be 0, except for errors of measurement. This study examined the effectiveness of two statistical tests of this area—a Z test for exact signed area and a Z test for e...
Article
Calculator effects were examined using methods taken from research on differential item functioning. Use of a calculator was controlled on two experimental forms of a test assembled from operational items used on a standardized university mathematics placement test. Results indicated that calculator effects were not present based on analysis of tes...
Article
Studies of differential item functioning under item response theory require that item parameter estimates be placed on the same metric before comparisons can be made. The present study compared the effects of three methods for linking metrics: a weighted mean and sigma method (WMS); the test characteristic curve method (TCC); and the minimum chi-sq...
Article
The area between two item response functions is often used as a measure of differential item functioning under item response theory. This area can be measured over either an open interval (i.e., exact) or closed interval. Formulas are presented for computing the closed-interval signed and unsigned areas. Exact and closed-interval measures were est...
