
Journal of Statistical Software

November 2006, Volume 17, Issue 5. http://www.jstatsoft.org/

ltm: An R Package for Latent Variable Modeling and Item Response Theory Analyses

Dimitris Rizopoulos

Catholic University of Leuven

Abstract

The R package ltm has been developed for the analysis of multivariate dichotomous
and polytomous data using latent variable models, under the Item Response Theory
approach. For dichotomous data the Rasch, the Two-Parameter Logistic, and Birnbaum's
Three-Parameter models have been implemented, whereas for polytomous data
Samejima's Graded Response model is available. Parameter estimates are obtained under
marginal maximum likelihood using the Gauss-Hermite quadrature rule. The capabilities
and features of the package are illustrated using two real data examples.

Keywords: latent variable models, item response theory, Rasch model, two-parameter logistic

model, three-parameter model, graded response model.

1. Introduction

Latent variable models (Bartholomew and Knott 1999; Skrondal and Rabe-Hesketh 2004)

constitute a general class of models suitable for the analysis of multivariate data. In principle,

latent variable models are multivariate regression models that link continuous or categorical

responses to unobserved covariates. The basic assumptions and objectives of latent variable

modeling can be summarized as follows (Bartholomew, Steele, Moustaki, and Galbraith 2002):

• A small set of latent variables is assumed to explain the interrelationships in a set of

observed response variables. This is known as the conditional independence assumption,

which postulates that the response variables are independent given the latent variables.

This simplifies the estimation procedure, since the likelihood contribution of the multivariate

responses is decomposed into a product of independent terms. In addition,

exploring conditional independence may help researchers first in drawing conclusions in

complex situations, and second in summarizing the information from observed variables

in few dimensions (reduction of dimensionality).


• Unobserved variables such as intelligence, mathematical or verbal ability, racial prejudice,

political attitude, consumer preferences, which cannot be measured by conventional

means, can be quantified by assuming latent variables. This is an attractive

feature that has applications in areas such as educational testing, psychology, sociology,

and marketing, in which such constructs play a very important role.

• Latent variable modeling is also used to assign scores to sample units in the latent

dimensions based on their responses. This score (also known as a ‘Factor Score’) is

a numerical value that indicates a person’s relative spacing or standing on a latent

variable. Factor scores may be used either to classify subjects or in the place of the

original variables in a regression analysis, provided that the meaningful variation in the

original data has not been lost.

Item Response Theory (IRT) (Baker and Kim 2004; van der Linden and Hambleton 1997)

considers a class of latent variable models that link mainly dichotomous and polytomous

manifest (i.e., response) variables to a single latent variable. The main applications of IRT can

be found in educational testing in which analysts are interested in measuring examinees’ ability

using a test that consists of several items (i.e., questions). Several models and estimation

procedures have been proposed that deal with various aspects of educational testing.

The aim of this paper is to present the R (R Development Core Team 2006) package ltm,

available from CRAN (http://CRAN.R-project.org/), which can be used to fit a set of latent

variable models under the IRT approach. The main focus of the package is on dichotomous

and polytomous response data. For Gaussian manifest variables the function factanal() of

package stats can be used.

The paper is organized as follows. Section 2 briefly reviews the latent variable models for

dichotomous and polytomous data. In Section 3 the use of the main functions and methods

of ltm is illustrated using two real examples. Finally, in Section 4 we describe some extra

features of ltm and refer to future extensions.

2. Latent variable models formulation

The basic idea of latent variable analysis is to find, for a given set of response variables
$x_1, \ldots, x_p$, a set of latent variables $z_1, \ldots, z_q$ (with $q \ll p$) that contains essentially the
same information about dependence. Latent variable regression models usually have the
following form

$$E(x_i \mid z) = g(\lambda_{i0} + \lambda_{i1} z_1 + \cdots + \lambda_{iq} z_q), \quad i = 1, \ldots, p, \tag{1}$$

where $g(\cdot)$ is a link function, $\lambda_{i0}, \ldots, \lambda_{iq}$ are the regression coefficients for the $i$th manifest
variable, and $x_i$ is independent of $x_j$, for $i \neq j$, given $z = \{z_1, \ldots, z_q\}$. The common factor
analysis model assumes that the $x_i$'s are continuous random variables following a Normal
distribution with $g(\cdot)$ being the identity link. In this paper we focus on IRT models, and
consider mainly dichotomous and polytomous items, in which $E(x_i \mid z)$ expresses the
probability of endorsing one of the possible response categories. In the IRT framework usually
one latent variable is assumed, but for models on dichotomous responses the inclusion of two
latent variables is briefly discussed in Section 4.


2.1. Models for dichotomous data

The basic ingredient of the IRT modeling for dichotomous data is the model for the probability
of positive (or correct) response in each item given the ability level $z$. A general model for
this probability for the $m$th examinee in the $i$th item is the following

$$P(x_{im} = 1 \mid z_m) = c_i + (1 - c_i)\, g\{\alpha_i (z_m - \beta_i)\}, \tag{2}$$

where $x_{im}$ is the dichotomous manifest variable, $z_m$ denotes the examinee's level on the latent
scale, $c_i$ is the guessing parameter, $\alpha_i$ the discrimination parameter and $\beta_i$ the difficulty
parameter. The guessing parameter expresses the probability that an examinee with very low
ability responds correctly to an item by chance. The discrimination parameter quantifies how
well the item distinguishes between subjects with low/high standing in the latent scale, and
the difficulty parameter expresses the difficulty level of the item.

The one-parameter logistic model, also known as the Rasch model (Rasch 1960), assumes
that there is no guessing parameter, i.e., $c_i = 0$, and that the discrimination parameter equals
one, i.e., $\alpha_i = 1$, $\forall i$. The two-parameter logistic model allows for different discrimination
parameters per item and assumes that $c_i = 0$. Finally, Birnbaum's three-parameter model
(Birnbaum 1968) estimates all three parameters per item.
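To make the nesting of the three models concrete, the following base R sketch evaluates model (2) under the logit link. The function name `irf`, the argument name `guess` (standing for the guessing parameter) and all parameter values are illustrative choices made for this sketch; none of them is part of ltm.

```r
# Item response function for model (2) under the logit link.
# plogis() is the logistic cdf in base R; `irf` and `guess` are names
# chosen for this sketch only.
irf <- function(z, alpha = 1, beta = 0, guess = 0) {
  guess + (1 - guess) * plogis(alpha * (z - beta))
}

z <- c(-3, 0, 3)                                  # made-up ability levels
irf(z, beta = 0.5)                                # Rasch: guess = 0, alpha = 1
irf(z, alpha = 1.8, beta = 0.5)                   # two-parameter logistic: guess = 0
irf(z, alpha = 1.8, beta = 0.5, guess = 0.2)      # three-parameter model
```

At `z` equal to the difficulty the logistic term is 0.5, so the response probability is halfway between the guessing parameter and one; as `z` decreases the probability approaches the guessing parameter.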

The two most common choices for $g(\cdot)$ are the probit and the logit link, which correspond
to the cumulative distribution function (cdf) of the normal and logistic distributions,
respectively. The functions included in ltm fit (2) under the logit link. Approximate results under
the probit link for the one- and two-parameter logistic models can be obtained using the
relation

$$\alpha_i^{(p)} (z_m - \beta_i^{(p)}) \approx 1.702\, \alpha_i^{(l)} (z_m - \beta_i^{(l)}), \tag{3}$$

where $\alpha_i^{(p)}$, $\alpha_i^{(l)}$ are the discrimination parameters under the probit and logit link, respectively,
and $\beta_i^{(p)}$, $\beta_i^{(l)}$ are defined analogously. The scaling constant 1.702 is chosen such that the
absolute difference between the normal and logistic cdf is less than 0.01 over the real line.
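The bound on the difference between the two cdfs can be checked numerically. The sketch below compares the standard normal cdf with the logistic cdf evaluated at 1.702 times its argument; the grid limits and step size are arbitrary choices for illustration, so this is a numerical check on a grid rather than a proof over the whole real line.

```r
# Compare the standard normal cdf with the rescaled logistic cdf on a fine grid.
x <- seq(-10, 10, by = 0.001)                 # grid chosen for illustration

max_scaled   <- max(abs(pnorm(x) - plogis(1.702 * x)))
max_unscaled <- max(abs(pnorm(x) - plogis(x)))

max_scaled < 0.01     # with the scaling constant
max_unscaled < 0.01   # without it the two curves differ far more
```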

2.2. Models for polytomous ordinal data

Analysis of polytomous manifest variables is currently handled by ltm using the Graded
Response Model (GRM). The GRM was first introduced by Samejima (1969), and postulates
that the probability of the $m$th subject to endorse the $k$th response for the $i$th item is expressed
as

$$P(x_{im} = k \mid z_m) = g(\eta_{ik}) - g(\eta_{i,k+1}), \tag{4}$$

$$\eta_{ik} = \alpha_i (z_m - \beta_{ik}), \quad k = 1, \ldots, K_i,$$

where $x_{im}$ is the ordinal manifest variable with $K_i$ possible response categories, $z_m$ is the
standing of the $m$th subject in the latent trait continuum, $\alpha_i$ denotes the discrimination
parameter, and the $\beta_{ik}$'s are the extremity parameters with $\beta_{i1} < \cdots < \beta_{ik} < \cdots < \beta_{i,K_i-1}$ and
$\beta_{iK_i} = \infty$. The interpretation of $\alpha_i$ is essentially the same as in the models for dichotomous
data. However, in GRM the $\beta_{ik}$'s represent the cut-off points in the cumulative probabilities
scale and thus their interpretation is not direct. ltm fits the GRM under the logit link.
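As an illustration of how category probabilities arise as differences of cumulative logistic curves, the following base R sketch computes them for a single item. It rests on an indexing convention that is an assumption of this sketch and is not spelled out in the text: the ordered extremity parameters are padded with $-\infty$ and $+\infty$, and category $k$ receives the difference between consecutive values of $g\{\alpha_i(z_m - \beta_{ik})\}$. The function name `grm_probs` and the parameter values are hypothetical.

```r
# Category probabilities for one item under a graded response model (logit link).
# Assumed convention (not taken from the text): thresholds are padded with
# -Inf and +Inf, and category k gets g(eta_{k-1}) - g(eta_k). alpha > 0 is assumed.
grm_probs <- function(z, alpha, beta) {
  stopifnot(alpha > 0, !is.unsorted(beta, strictly = TRUE))  # ordered thresholds
  cum <- plogis(alpha * (z - c(-Inf, beta, Inf)))            # decreasing from 1 to 0
  -diff(cum)                                                 # one value per category
}

p <- grm_probs(z = 0.3, alpha = 1.5, beta = c(-1, 0, 1.2))   # made-up values
p         # probabilities of the four response categories
sum(p)    # the probabilities add up to one
```

With three finite extremity parameters the item has four categories, which mirrors the statement that the last extremity parameter is infinite.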


Several alternatives to the GRM have been proposed for the analysis of polytomously

scored items. Two that are frequently applied are the Partial Credit and the Rating

Scale models. The partial credit model is more suitable in cases where the difference between

response options is identical for different items in the attitude scale, whereas the rating scale

model is applicable to a test in which all items have the same number of categories. We refer

to van der Linden and Hambleton (1997) and Zickar (2002) for additional information and

discussion about the polytomous models.

2.3. Implementation in ltm

Estimation of model parameters has received a lot of attention in the IRT literature. Under

Maximum Likelihood three major methods have been developed, namely conditional,

full, and marginal maximum likelihood. A detailed overview of these methods is presented

in Baker and Kim (2004) and a brief discussion about the relative merits of each method

can be found in Agresti (2002, Section 12.1.5). In addition, parameter and ability estimation

under a Bayesian approach is reviewed in Baker and Kim (2004). Package ltm fits the models

presented in Sections 2.1 and 2.2 using Marginal Maximum Likelihood Estimation (MMLE).

Conditional maximum likelihood estimation has recently been implemented in package eRm
(Mair and Hatzinger 2006), but only for some Rasch-type models, and Markov chain Monte
Carlo estimation for the one- and k-dimensional latent trait models is available from the
MCMCpack package (Martin and Quinn 2006).

Parameter estimation under MMLE assumes that the respondents represent a random sample
from a population and their ability is distributed according to a distribution function $F(z)$.
The model parameters are estimated by maximizing the observed data log-likelihood obtained
by integrating out the latent variables; the contribution of the $m$th sample unit is

$$\ell_m(\theta) = \log p(x_m; \theta) = \log \int p(x_m \mid z_m; \theta)\, p(z_m)\, dz_m, \tag{5}$$

where $p(\cdot)$ denotes a probability density function, $x_m$ denotes the vector of responses for the
$m$th sample unit, $z_m$ is assumed to follow a standard normal distribution and $\theta = (\alpha_i, \beta_i)$.

Package ltm contains four model fitting functions, namely rasch(), ltm(), tpm() and grm()

for fitting the Rasch model, the latent trait model, the three-parameter model, and the graded

response model, respectively. The latent trait model is a general latent variable model for

dichotomous data of the form (1), including as a special case the two-parameter logistic model.

The integral in (5) is approximated using the Gauss-Hermite quadrature rule. By default,

in rasch(), tpm() and grm() 21 quadrature points are used, whereas ltm() uses 21 points

when one latent variable is specified and 15 otherwise. It is known (Pinheiro and Bates 1995)

that the number of quadrature points used may influence the parameter estimates, standard

errors and log-likelihood value, especially for the case of two latent variables and nonlinear

terms as described in Section 4. Thus, it is advisable to investigate its effect by fitting the

model with an increasing number of quadrature points. However, for the unidimensional (i.e.,

one latent variable) IRT models considered so far, the default number of points will be, in the

majority of the cases, sufficient.
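The quadrature approximation of (5) can be sketched in base R. The code below is not the implementation used in ltm: it obtains Gauss-Hermite nodes and weights with the Golub-Welsch eigenvalue method (a technique swapped in here for self-containedness) and applies them to a two-parameter logistic model. The function names `gauss_hermite` and `marginal_loglik`, the response pattern and the parameter values are all hypothetical.

```r
# Gauss-Hermite nodes and weights for the weight function exp(-t^2),
# computed with the Golub-Welsch eigenvalue method (base R only).
gauss_hermite <- function(k) {
  stopifnot(k >= 2)
  J <- matrix(0, k, k)                       # symmetric tridiagonal Jacobi matrix
  off <- sqrt(seq_len(k - 1) / 2)
  J[cbind(1:(k - 1), 2:k)] <- off
  J[cbind(2:k, 1:(k - 1))] <- off
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

# Approximate contribution (5) of one response vector x under the
# two-parameter logistic model, with z following a standard normal.
marginal_loglik <- function(x, alpha, beta, k = 21) {
  gh <- gauss_hermite(k)
  z <- sqrt(2) * gh$nodes                    # rescale nodes to the N(0, 1) density
  w <- gh$weights / sqrt(pi)                 # rescaled weights sum to one
  lik <- vapply(z, function(zq) {
    p <- plogis(alpha * (zq - beta))
    prod(p^x * (1 - p)^(1 - x))              # conditional independence given z
  }, numeric(1))
  log(sum(w * lik))
}

x     <- c(1, 0, 1)                          # made-up response pattern
alpha <- c(1.0, 1.5, 0.8)                    # made-up discrimination parameters
beta  <- c(-0.5, 0.2, 1.0)                   # made-up difficulty parameters

# Effect of the number of quadrature points on the approximation
sapply(c(5, 21, 61), function(k) marginal_loglik(x, alpha, beta, k))
```

Comparing the values returned for an increasing number of points is a small-scale version of the sensitivity check recommended above.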

Maximization of the integrated log-likelihood (5) with respect to θ for rasch(), tpm() and

grm() is achieved using optim()’s BFGS algorithm. For ltm() a hybrid algorithm is adopted,

in which a number of EM iterations is initially used, followed by BFGS iterations until

convergence. In addition, for all four functions, the optimization procedure works under an


additive parameterization as in (1), i.e., $\lambda_{i0} + \lambda_{i1} z_m$; however, the parameter estimates for the

Rasch, the two-parameter logistic, the three-parameter, and the graded response models are

returned, by default, under parameterizations (2) and (4). This feature is controlled by the

IRT.param argument. Starting values are obtained either by fitting univariate GLMs to the

observed data with random or deterministic z values, or they can be explicitly set using the

start.val argument. The option of random starting values (i.e., use of random z values in the

univariate GLM) might be useful for revealing potential local maxima issues. By default all

functions use deterministic starting values (i.e., use of deterministic z values in the univariate

GLM). Furthermore, all four functions have a control argument that can be used to specify

various control values, such as the optimization method in optim() (for tpm() the nlminb()

optimizer is also available) and the corresponding maximum number of iterations, and the

number of quadrature points, among others. Finally, the four fitting functions return objects

whose class is named after the corresponding model fitting function (i.e., rasch() returns rasch

objects, etc.), for which the following methods are available: print(), coef(), summary(),

plot(), fitted(), vcov(), logLik(), anova(), margins() and factor.scores(); the last

two generic functions are defined in ltm and their use is illustrated in more detail in the

following section.
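The arguments described in this section can be combined as in the following sketch. It assumes that the ltm package is installed and uses the LSAT data that ships with it; the control values are arbitrary illustrations, not recommendations, and the calls are wrapped in a check so that the code does nothing when the package is unavailable.

```r
# Sketch of the fitting interface; requires the ltm package (assumed installed).
if (requireNamespace("ltm", quietly = TRUE)) {
  library(ltm)

  fit <- rasch(LSAT)                             # default settings
  coef(fit)                                      # IRT parameterization, as in (2)
  coef(rasch(LSAT, IRT.param = FALSE))           # additive parameterization, as in (1)

  # Random starting values, useful for spotting local maxima:
  fit_rnd <- rasch(LSAT, start.val = "random")
  c(logLik(fit), logLik(fit_rnd))                # compare the attained log-likelihoods

  # Control values: optim() method, maximum iterations, quadrature points
  fit_ctrl <- rasch(LSAT, control = list(method = "BFGS", iter.qN = 200, GHk = 21))

  fit_2pl <- ltm(LSAT ~ z1)                      # two-parameter logistic model
  class(fit_2pl)                                 # class named after the fitting function
}
```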

3. Package ltm in use

We shall demonstrate the use of ltm on two data sets; the first one concerns binary data where

rasch(), ltm() and tpm(), and their methods are investigated, while for the second one that

deals with ordinal data, grm() and its methods are illustrated. For both examples the results

are presented under the default number of quadrature points. To investigate sensitivity we

have also fitted the models with 61 points and essentially the same results have been obtained.
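A sensitivity check of this kind might look as follows for the Rasch model. This is a sketch that assumes the ltm package is installed and uses the `GHk` control value to set the number of quadrature points; the object names are illustrative.

```r
# Refit the Rasch model with more quadrature points and compare the results.
if (requireNamespace("ltm", quietly = TRUE)) {
  library(ltm)

  fit_21 <- rasch(LSAT)                              # default number of points
  fit_61 <- rasch(LSAT, control = list(GHk = 61))    # 61 quadrature points

  c(logLik(fit_21), logLik(fit_61))                  # log-likelihood values
  max(abs(coef(fit_21) - coef(fit_61)))              # largest change in the estimates
}
```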

3.1. An example with binary data

In this section we consider data from the Law School Admission Test (LSAT) that has been
taken by 1000 individuals responding to five questions. This is a typical example of an
educational test data-set presented also in Bock and Lieberman (1970). LSAT data are
available in ltm as the data.frame LSAT.

At an initial step, descriptive statistics for LSAT are produced using the descript() function:

R> descript(LSAT)

Descriptive statistics for 'LSAT' data-set

Sample:
 5 items and 1000 sample units; 0 missing values

Proportions for each level of response:
      Item 1 Item 2 Item 3 Item 4 Item 5
0      0.076  0.291  0.447  0.237  0.130
1      0.924  0.709  0.553  0.763  0.870
logit  2.498  0.891  0.213  1.169  1.901