Validation of Single-Factor Structure and
Scoring Protocol for the Health Assessment Questionnaire-Disability Index
JASON C. COLE, SAROSH J. MOTIVALA, DINESH KHANNA, JESSICA Y. LEE,
HAROLD E. PAULUS, AND MICHAEL R. IRWIN
Objective. The extensively used Health Assessment Questionnaire Disability Index (HAQ-DI) has been well received by
the research and clinical community, notably because of its measurement strengths including reliability and stability of
scores over time, utility in observational studies and clinical trials, predictive relationship with morbidity and mortality
in rheumatoid arthritis (RA), and its translation for use in different countries. However, HAQ-DI scoring has not been
validated. The purpose of this study was to examine the structural validity of the HAQ-DI and evaluate the latent factors
underlying HAQ-DI scoring.
Methods. This study used a cross-validation approach on a total of 278 patients with RA. Exploratory and confirmatory
factor analyses were performed.
Results. The analyses yielded a single-factor structure for the HAQ-DI, which supports the current scoring system of the HAQ-DI.
Additionally, modification indices suggested improved model fit with the secondary inclusion of correlated residual
scores from a motor skills subdomain.
Conclusion. The current study provides the first validation of the HAQ-DI scoring system as determined by its latent
factor structure. In addition, the findings suggest some benefit from a secondary interpretation of the scores based on
domains that measure motor skills.
KEY WORDS. Rheumatoid arthritis; Latent analysis; Confirmatory factor analysis; HAQ-DI.
Rheumatoid arthritis (RA) is a chronic, systemic, inflam-
matory disorder of unknown etiology that primarily in-
volves the joints. It may be remitting, but if uncontrolled,
may lead to deformity and destruction of joints due to the
erosion of cartilage and bone. This symmetrical disease
often progresses from peripheral to more proximal joints
and, in many patients, results in significant functional
disability. This disability can lead to difficulties in per-
forming simple physical activity and everyday tasks such
as cleaning, cooking, and dressing. Patient outcomes have
not been fully explained by laboratory or radiographic
measures, and as such, disability assessment offers an
important component of disease activity characterization.
The Health Assessment Questionnaire-Disability Index
(HAQ-DI) published in 1980 by Fries et al (1) has been
used extensively in the evaluation of disease-specific dis-
ability or quality of life (QOL) related to RA in the United
States and other countries (2–4). Both observational stud-
ies (2–4) and clinical trials (5–7) have used the HAQ-DI
and found its scores to be an important predictor of work
disability (8), morbidity (8,9), and mortality (10). Addi-
tionally, a recent study (11) has shown that the modified
HAQ (12) correlated with a latent construct of physical
disability at 0.87. Latent constructs are constructs of
interest that cannot be measured directly (e.g., one’s
preferences).
Supported in part by grants from the National Institutes of
Health MH55253, T32-MH18399, AG18367, AT00255, AR/AG41867,
AR049840, and M01 RR00827. Dr. Khanna’s
work was supported in part by the Arthritis and Sclero-
MD, MS, Jessica Y. Lee, Harold E. Paulus, MD, Michael R.
Irwin, MD: University of California, Los Angeles.
Address correspondence to Michael R. Irwin, MD, Cousins
Center for Psychoneuroimmunology, UCLA Neuropsychiatric
Institute, 300 UCLA Medical Plaza, Room 3109, Los
Angeles, CA 90095-7076. E-mail: firstname.lastname@example.org.
Submitted for publication November 9, 2004; accepted in
revised form March 15, 2005.
Arthritis & Rheumatism (Arthritis Care & Research)
Vol. 53, No. 4, August 15, 2005, pp 536–542
© 2005, American College of Rheumatology
Notwithstanding the widespread use of the HAQ-DI and
its appropriate reliability and convergent validity, empirical
support for the factor structure and scoring system of
the HAQ-DI is limited. In other words, the structural
validity of the HAQ-DI has not been adequately assessed;
structural validation is important so that one can under-
stand how to score and interpret the HAQ-DI (13). For
example, the HAQ currently uses a single total score, yet it
is not known whether the interrelationships among the domains
that make up the HAQ support the use of a single
score or whether multiple scores should be obtained. Daleo
et al (14) and Kaufman (15) have noted that the scoring
system of a measure should reflect its latent structure: if a
measure has 3 factors, 3 scores should be calculated and
interpreted. For example, the popular Center for Epidemi-
ologic Studies Depression Scale (CES-D) has long been
viewed as having 4 factors, but only 1 score is calculated
(16), yielding what Daleo et al and Kaufman would deem
as an inappropriate scoring system given the factor struc-
ture. Cole et al (17) demonstrated that the CES-D has a
single hierarchical factor that subsumed the 4 factors in
the CES-D, and therefore provided the first empirical evi-
dence that the CES-D should be interpreted as a single
score based on the refined factor structure.
Only 1 study detailing a factor analysis of the HAQ-DI
was found (18) during an exhaustive review of the pub-
lished literature using PubMed, PsychInfo, and Social Sci-
ence Citation Index. Although Daltroy et al (18) found that
a single dominant factor comprised the factor structure of
the HAQ-DI, their results were generated using only ex-
ploratory factor analysis (EFA) even though confirmatory
factor analysis (CFA) is now regarded as essential after the
initial development of a measure (19). The use of CFA tests
the viability and stability of the underlying construct(s)
being evaluated (20,21). Indeed, CFA should be used as
part of the process when determining the structural valid-
ity of any previously validated measure (21), and opti-
mally EFA and CFA can be integrated using a cross-vali-
dation strategy (20).
The goal of the current study was to examine the struc-
tural validity of the HAQ-DI using a cross-validation ap-
proach with EFA and CFA. CFA was used to compare the
structure obtained through EFA with other logical struc-
tures for the HAQ-DI. The results were used to guide and
clarify scoring and interpretation procedures for the HAQ-DI.
SUBJECTS AND METHODS
Subjects. Subjects were a subset of individuals with RA
participating in a longitudinal study involving the West-
ern Consortium of Practicing Rheumatologists, which is a
regional consortium of 29 rheumatology practices in the
western United States and Mexico as described in previ-
ous studies (22,23). The inclusion criteria for this study
included a diagnosis of RA as defined by the American
College of Rheumatology (formerly the American Rheuma-
tism Association) criteria (24) within 15 months of symp-
tom onset, no previous disease-modifying antirheumatic
drug treatment, rheumatoid factor seropositive (RF titer
≥1:80 or ≥40 IU/ml), ≥6 swollen joints, and ≥9 tender
joints. Symptom onset was defined as the date when mus-
culoskeletal symptoms began, provided that these symp-
toms persisted and led to the diagnosis of RA. This study
was approved by the appropriate institutional review boards.
The consortium rheumatologists assessed patient dis-
ease status at study entry (baseline), 6 months, 1 year, and
yearly thereafter. Using standard methods, detailed physi-
cian assessment included all of the core set outcome mea-
sures required to calculate the disease activity score
(DAS), including complete tender and swollen joint
counts and acute phase reactant measures, as well as
0–100-mm visual analog scales for global and pain assess-
ments. The DAS was calculated according to the published
algorithm using the Ritchie index, swollen joint count of
44 joints, and Westergren erythrocyte sedimentation rate
(ESR) in mm/hour (25). In addition, study visits included
radiographs of the hands, wrists, and forefeet; assays for
RF; and self-report measurements such as the HAQ-DI and
the CES-D (26). At each scheduled physician visit, blood
specimens were collected to determine C-reactive protein
levels; ESR was determined, when clinically indicated, in
the rheumatologist’s office or local laboratory.
Measures. The HAQ-DI is a condition-specific measure
of functional status or QOL (measuring activities of daily
living) intended for use in arthritis (1). The original
HAQ-DI was designed as a 20-item self-administered ques-
tionnaire that examined difficulties with the performance
of activities of daily living on a 0–3 scale in 8 domains
(dressing and grooming, arising, eating, walking, hygiene,
reach, grip, and other activities). A grade 3 of difficulty is
assigned to patients using assistive/adaptive devices (such
as canes or walkers). The HAQ-DI score is calculated by summing
the highest item score in each of the 8 domains and
dividing the sum by 8, resulting in a score range of 0 (no
disability) to 3 (severe disability) on an ordinal scale.
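The scoring rule described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name, the dictionary layout, and the example respondent are hypothetical, and the adjustment for assistive devices is deliberately left out.

```python
from typing import Dict, List

# The 8 domains named in the text.
DOMAINS = [
    "dressing and grooming", "arising", "eating", "walking",
    "hygiene", "reach", "grip", "other activities",
]


def haq_di_score(item_scores: Dict[str, List[int]]) -> float:
    """Sum the highest item score in each domain and divide by 8."""
    domain_scores = []
    for domain in DOMAINS:
        items = item_scores[domain]
        if not items or any(s not in (0, 1, 2, 3) for s in items):
            raise ValueError(f"invalid item scores for {domain!r}")
        domain_scores.append(max(items))  # highest item defines the domain
    return sum(domain_scores) / len(DOMAINS)


# Hypothetical respondent (illustrative values, not study data).
example = {
    "dressing and grooming": [1, 0],
    "arising": [0, 0],
    "eating": [2, 1, 0],
    "walking": [1, 1],
    "hygiene": [0, 3, 1],
    "reach": [1, 2],
    "grip": [0, 0, 1],
    "other activities": [2, 2, 1],
}
index = haq_di_score(example)  # domain maxima 1, 0, 2, 1, 3, 2, 1, 2
```

Because only the highest item in a domain contributes, two respondents can reach the same domain score through different items, a point the discussion returns to as a possible limitation.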
Statistical analysis. Data were entered and cross-checked
using SPSS version 11.5 (SPSS, Chicago, IL) by
research assistants with ample data entry experience.
HAQ-DI scores were obtained per the instructions of Bruce
and Fries (27). A single-extraction variant of the multiple
imputation procedure for missing data replacement (28)
was conducted for the missing data points using NORM software
(29). Multiple imputation uses a regression-type approach
to estimate each missing datum. Imputed values
are generated taking into account responses from the same
participant on other correlated variables and responses to
the same domain from participants who responded similarly.
Using such multiple imputation formulae, Rubin
and Schenker (30) have demonstrated that single imputation
yields results virtually identical to those of the more
laborious multiple database process. HAQ-DI domain descriptive
statistics are listed in Table 1.
Because the joint distribution of many health-outcome
variables is typically nonnormal (17,31), adjustments need
to be made to control for nonnormality in any latent
analysis using maximum likelihood estimation (MLE).
Bootstrapping was used during model estimation to control
for multivariate nonnormality (32,33). The process of
bootstrapping takes multiple random subsamples from the
current sample to smooth over any inaccuracies in the
estimates of model fit due to nonnormality.
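The resampling idea can be sketched in Python for a simple statistic. This is a generic percentile bootstrap that resamples with replacement, not the procedure implemented in AMOS for model fit; the function name, the seed, and the example scores are assumptions made for illustration.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def bootstrap_ci(
    data: Sequence[float],
    statistic: Callable[[List[float]], float],
    n_resamples: int = 2000,
    alpha: float = 0.10,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for `statistic` (90% when alpha=0.10)."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample draws n observations with replacement from the sample.
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    k = round(alpha / 2 * n_resamples)  # number of estimates cut from each tail
    return estimates[k], estimates[n_resamples - 1 - k]


# Hypothetical, visibly skewed scores (not study data).
scores = [0.0, 0.1, 0.1, 0.3, 0.4, 0.5, 0.9, 1.1, 1.6, 2.8]
lower, upper = bootstrap_ci(scores, statistics.mean)
```

No normality assumption enters the interval; its width comes entirely from how the statistic varies across resamples.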
Exploratory factor analysis. One randomly divided subsample
(n = 134) of the total sample was analyzed with
EFA using SPSS version 12.0 (SPSS). Principal compo-
nents analysis was used to determine the number of factors
to retain for the EFA, per the recommendations of Preacher
and MacCallum (19). In doing so, we examined the scree
plot (a plot of eigenvalues, which index the strength of each factor, against the
number of factors; once the plot line flattens, factors
to the right are considered negligible) along with the Kaiser-
Guttman criterion (factors with eigenvalues >1.0 should be kept; see
reference 19). Subsequently, EFA was carried out using
MLE extraction factor analysis with direct oblimin rotation
(a type of oblique rotation), as suggested by Preacher and
MacCallum (19). The EFA analyses generate factor load-
ings, which are measures of how strongly the observed
variables in the HAQ-DI are associated with its latent
factor(s). Factor loadings for each domain were compared
with criteria established by Comrey and Lee (34): values
≥0.71 signify excellent loadings, 0.63–0.70 are very good,
0.55–0.62 are good, 0.45–0.54 are fair, 0.32–0.44 are
deemed poor, and any values <0.32 are discarded.
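The Comrey and Lee bands can be written as a small lookup. The sketch below assumes that each band is defined by its lower bound and that the absolute value of the loading is what is classified; the function name is hypothetical.

```python
def classify_loading(loading: float) -> str:
    """Label a factor loading using the bands attributed to Comrey and Lee."""
    value = abs(loading)  # assumption: the sign of the loading is ignored
    if value >= 0.71:
        return "excellent"
    if value >= 0.63:
        return "very good"
    if value >= 0.55:
        return "good"
    if value >= 0.45:
        return "fair"
    if value >= 0.32:
        return "poor"
    return "discard"
```

Under this rule, any loading between 0.74 and 0.87 would be labeled excellent.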
Confirmatory factor analysis. Once the EFA was com-
pleted, a CFA was undertaken in the second subsample
(n = 144) to test the stability and replicability of the latent
model produced by the EFA (Figure 1). Therein, the rect-
angular blocks represent HAQ-DI domains with circles to
their left that represent each domain’s residual (i.e., any-
thing not measured by the relationship between the
HAQ-DI domain and the latent variable). The circular fig-
ure to the right of the domains represents the overall
HAQ-DI latent variable of disease impact.
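The division of the sample into an exploratory and a confirmatory subsample can be sketched as follows. The participant identifiers, the seed, and the function name are hypothetical; only the subsample sizes (134 and 144) come from the text.

```python
import random
from typing import List, Sequence, Tuple


def split_sample(
    ids: Sequence[int], n_first: int, seed: int = 2005
) -> Tuple[List[int], List[int]]:
    """Randomly assign participants to two non-overlapping subsamples."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_first], shuffled[n_first:]


participants = list(range(1, 279))  # 278 hypothetical participant IDs
efa_ids, cfa_ids = split_sample(participants, n_first=134)
```

Keeping the two subsamples disjoint is what allows the CFA to serve as an independent test of the structure suggested by the EFA.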
CFA was performed using the AMOS statistical software
package (35). MLE extraction was used to estimate the CFA
model. The purpose of the CFA was to determine whether
the EFA-derived model provided sufficient goodness-of-fit
with the data in the second subsample, thus providing
evidence for the stability of the model (e.g., how closely
the model’s purported covariance matrix fits with the ac-
tual covariance matrix of the subsample). Schumacker and
Lomax (13) suggest that it is best to review multiple mea-
sures of model-data fit to examine the model from various
perspectives. Therefore, in the current study, 4 fit indexes
were used: Goodness-of-Fit (GFI), Adjusted Goodness-of-
Fit (AGFI), Comparative Fit Index (CFI), and Root Mean
Squared Error of Approximation (RMSEA). GFI and AGFI
were evaluated with a minimum criterion of 0.90 (36), and
CFI should be no less than 0.95 (37). RMSEA yields both a
score and a 90% confidence interval; good fit would be
indicated when the scores at the lower bound are ≤0.06
(38). GFI and AGFI are used to estimate strengths of asso-
ciation; GFI measures the association between the model
and data, whereas AGFI adjusts GFI by taking into account
the degrees of freedom (df) in a model (GFI can be inflated
by high df). CFI and RMSEA provide estimates of Type I
and Type II error, respectively. CFI is a measure of Type I
error in that it specifies the amount of difference between
the examined model and the independence model (i.e., a
standard comparison model that asserts none of the components
in the model are related), with higher scores
indicating larger differences; RMSEA is complementary to
CFI because it is a measure of Type II error, determining
the difference between the examined model and the satu-
rated model (i.e., another standard model that asserts each
of the components in the model are related to all other
components in the model), with lower scores indicating
greater differences. Ideally, the examined model should be
markedly different from the independence model and the
saturated model. If each of these 4 fit indices meets or
surpasses these thresholds, then the model can be considered
satisfactory.
Table 1. Health Assessment Questionnaire domain correlations and descriptive statistics*
1. Dressing and grooming
Mean ± SD score†
1.00 ± 0.76  0.98 ± 0.76  1.07 ± 0.94  0.89 ± 0.83  1.17 ± 1.01  1.28 ± 0.99  1.05 ± 0.85  1.22 ± 0.88
* Correlations provided for descriptive purposes and were not analyzed for significance. All data were based upon multiple imputation data
† Score range for all domains was 0–3. Mean ± SD score for all 278 participants was 1.17 ± 0.70.
Figure 1. Health Assessment Questionnaire single-factor model
from the confirmatory factor analysis.
Model refinement. Often a model’s fit indices may come
close to reaching the abovementioned thresholds, but not
close enough to be considered satisfactory. In such a case,
minor adjustments to the relationships in the model can be
made and the model can then be retested. The determina-
tion of which adjustments to make can be guided by using
modification indices, which provide an estimate of the
improvement in model fit that will occur by adding a given
relationship, including direct paths and correlations (13).
A standard approach of using a modification index of at
least 10.0 was used; paths with a modification index <10
were considered to be too weak to provide substantive
benefit. Modification of the model after an initial analysis
will only be conducted if the modification meets statistical
criteria and fits with the theoretical understanding of the
HAQ-DI (13). When modifications are added to a model,
the model will be rerun and interpreted with the new fit
indices.
RESULTS
A total of 315 participants were admitted into the study, of
whom 27 participants (12%) were missing responses on 3
or more domains on the HAQ-DI. These subjects were
removed from the database to allow for proper use of
missing data replacement techniques (40). No more than
20% of missing data for any domain were found after
removal of the 27 participants. According to Schafer and
Graham (28), data should be missing at random to use
missing data replacement appropriately. The presence of
random or nonrandom missing data can be ascertained by
examining the patterns of missing data to ensure that no
one pattern (or patterns) is particularly likely over other
patterns of missing data. A review of the current data
found no such patterns, suggesting that these data are
accurately described as missing at random.
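A minimal Python sketch of this screening step is given below. It drops participants missing 3 or more domains and then tabulates the remaining missing-data patterns; the record layout, the identifiers, and the example values are hypothetical, and counting patterns is only an informal check, not a formal test of randomness.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Record = List[Optional[int]]  # 8 domain scores, None where missing


def screen_missing(
    records: Dict[str, Record], max_missing: int = 2
) -> Tuple[Dict[str, Record], Counter]:
    """Keep participants with at most `max_missing` missing domains,
    then count how often each pattern of missingness occurs."""
    kept = {
        pid: rec
        for pid, rec in records.items()
        if sum(v is None for v in rec) <= max_missing
    }
    patterns = Counter(tuple(v is None for v in rec) for rec in kept.values())
    return kept, patterns


# Hypothetical records (not study data).
records = {
    "p1": [1, 0, 2, 1, 0, 1, 2, 1],
    "p2": [None, 1, 1, 0, 2, 1, 1, 0],
    "p3": [None, None, None, 1, 0, 2, 1, 1],  # 3 missing: removed
    "p4": [0, 1, None, None, 1, 1, 0, 2],
}
kept, patterns = screen_missing(records)
```

A pattern that occurs far more often than the others would be a reason to question whether the data are missing at random.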
The final sample comprised 278 participants with a
mean ± SD age of 51 ± 13 years, a mean disease duration
of 8.7 ± 10 months, and a mean HAQ-DI score of 1.17 ±
0.70. This sample size provided extensive power for the
planned analyses (38). The descriptive statistics for the
HAQ-DI domain scores are provided in Table 1, including
correlations among the HAQ-DI domains as well as the
mean ± SD and range of scores for each domain. Correlations
among all of the domains were large (according to the
criteria from Cohen), ranging from the mid 0.50s to the
low 0.70s. Each of the 8 HAQ-DI domain scores ranged from
0 to 3, with means and SDs near 1.0 for most scales.
Furthermore, HAQ-DI total scores ranged from 0 to 3 with
a mean ± SD of 1.17 ± 0.70.
To provide an exploratory analysis of the HAQ-DI latent
structure, an EFA was run on a randomly assigned sample.
Results of the EFA are displayed in Table 2, where each
domain is given a loading and a communality value. The
loading refers to the correlation of a domain with the
obtained latent factor, and the communality is the shared
variance between the domain and the factor (i.e., the
square of the loading). In other words, a high loading and
communality mean that the domain has a strong relation-
ship with the latent factor. Additionally, Table 2 shows
that 68.4% of the variance was accounted for by a single
factor, suggesting that this single dominant factor alone
comprised the latent structure of the HAQ-DI. The single-
factor structure was favored over the next most-viable
model, a 2-factor structure, because examination of the
eigenvalues showed a sharp decrease from 5.47 (68.38%
variance explained) for the first factor to 0.61 (7.62%
variance explained) for the second factor. Additionally,
inspection of the scree plot revealed that the plot leveled off
at 2 factors (indicating a single-factor structure).
All 8 HAQ domains had excellent loadings (ranging from
0.74 to 0.87).
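The arithmetic behind these figures is simple enough to sketch. With 8 standardized domains the total variance is 8, so each eigenvalue divided by 8 gives the proportion of variance explained. The helper names below are hypothetical, and the strict ">" in the retention rule is an assumption about how the Kaiser-Guttman criterion was applied.

```python
from typing import List


def variance_explained(eigenvalue: float, n_variables: int) -> float:
    """Percent of total variance attributable to one factor."""
    return 100.0 * eigenvalue / n_variables


def communality(loading: float) -> float:
    """Shared variance between a domain and a single factor (loading squared)."""
    return loading ** 2


def kaiser_retained(eigenvalues: List[float], cutoff: float = 1.0) -> List[float]:
    """Eigenvalues that pass the Kaiser-Guttman rule (assumed to be > cutoff)."""
    return [e for e in eigenvalues if e > cutoff]


first = variance_explained(5.47, 8)    # about 68.4%
second = variance_explained(0.61, 8)   # about 7.6%
retained = kaiser_retained([5.47, 0.61])
```

The rounded eigenvalues reproduce the reported percentages to within rounding, and only the first factor passes the retention rule. For the reported range of loadings, the communalities run from roughly 0.55 (0.74 squared) to 0.76 (0.87 squared).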
Based on the single-factor solution of the EFA, a CFA
was run solely on this single-factor model using the other
random half of the sample. In this CFA, the single-factor
model was close but did not meet adequate fit criteria
(GFI = 0.89, AGFI = 0.80, CFI = 0.93, RMSEA = 0.13).
Whereas CFI was nearly acceptable, RMSEA was not.
These results indicate that the latent structure was missing
some significant relationships and that minor adjustments
in the model were needed. Thus, to find unmodeled paths
that have both statistical and theoretical importance to the
HAQ-DI model (13), modification indices were inspected.
A modification index is a statistic that displays how much
model fit will be improved by adding a new path to the
model. In most models, paths can be added as unidirec-
tional (i.e., regression paths) or bidirectional (i.e., correla-
tional). Because the current model contained only a single
factor, additional paths could only be added as correla-
tions, specifically correlated residuals (42).
Table 2. Factor matrix for the one-factor solution*
Dressing and grooming
Percent of total variance
* HAQ = Health Assessment Questionnaire; h² = communality for
maximum likelihood estimation extraction; A = excellent loading.
Model refinement: motor skills subdomain. Residuals
refer to the variance that is not accounted for by the relationship
of a particular domain to its latent variable. For
example, the residual of the domain Grip is all of the
variance not otherwise accounted for by the path between
Grip and Health Assessment, or 1 − 0.76 for standardized
values (Figure 1). This residual value is influenced by
multiple other sources of variance, such as method variance,
shared content beyond the primary factor, and measurement
error (42). Hence, a correlation between 2 residuals
occurs when aspects of these residual terms are
strongly related, although correlations between residuals
are not generally assumed to arise from correlated measurement
error, as this should be random (43). The first
examination of modification indices revealed relatively high values
between the residuals for Reach and Eating (modification
index = 11.88), Grip and Eating (modification index =
14.63), and Arising and Walking (modification index =
14.61), resulting in correlations of r = 0.22, 0.29, and 0.29,
respectively. All of these correlated residuals appear to
have a content relationship in that each focuses on motor
skills. Thus, after determining the HAQ score and disease-
specific QOL, these data suggest that the additional impact
on motor skills can be assessed by examining pairs of
scores on Arising and Walking, then Grip and Eating, and
finally Hygiene and Eating.
A second examination of modification indices was un-
dertaken, after these 3 additions of the correlated residuals
of the motor skills subdomain were added to the model.
This second round of modification indices indicated that
one more addition should be made by correlating the residuals
between Grip and Hygiene (modification index =
12.20). However, it should be noted that the correlation
between residuals for Grip and Hygiene was negative
(−0.40). Whereas the other correlated residuals have a
more logical interpretation, interpreting negatively corre-
lated residuals between Grip and Hygiene is more elusive
and should be examined further with other measures of
manual dexterity and hygiene. Moreover, the Grip-Hy-
giene residual correlation was only appreciable once the
previous correlated residuals were added, suggesting that
the residuals of Grip and Hygiene have a complicated
relationship to the first 3 combined correlated residuals.
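The effect of these additions on model size can be checked with a short degrees-of-freedom count. The sketch assumes a single factor whose scale is fixed (for example, factor variance set to 1), so that the free parameters are 8 loadings, 8 residual variances, and one parameter per correlated residual; the function name is hypothetical.

```python
def cfa_degrees_of_freedom(
    n_indicators: int, n_correlated_residuals: int = 0
) -> int:
    """df = unique variances/covariances minus freely estimated parameters."""
    n_moments = n_indicators * (n_indicators + 1) // 2
    n_free = 2 * n_indicators + n_correlated_residuals
    return n_moments - n_free


df_initial = cfa_degrees_of_freedom(8)      # plain single-factor model
df_modified = cfa_degrees_of_freedom(8, 4)  # with 4 correlated residuals
```

Under these assumptions the plain model has 20 df and the modified model 16 df, which agrees with the 16 df reported for the final model in Table 3.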
No more sufficiently large correlated residuals were in-
dicated, and therefore the model was rerun to test the new
fit indices. The modified CFA model generated satisfactory
fit statistics for all model fit criteria (Table 3). Figure 1
shows the final factor structure of the HAQ-DI, including
the standardized factor loadings for the HAQ-DI latent
variable on each of the HAQ-DI domains, as well as the
level of standardized correlation between the domains.
The fit for this model provides substantial evidence for the
use of a single total score on the HAQ-DI.
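As a rough cross-check, the RMSEA point estimate can be recomputed from the reported chi-square and the stated thresholds applied to both models. The sketch uses a common textbook formula with N − 1 in the denominator; the exact variant used by the software is not stated in the text, and the function names are hypothetical.

```python
import math
from typing import Dict, Optional


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def meets_criteria(
    gfi: float, agfi: float, cfi: float, rmsea_lower: Optional[float] = None
) -> Dict[str, bool]:
    """Apply the thresholds described in the methods; the RMSEA check is
    skipped when no lower confidence bound is available."""
    result = {"GFI": gfi >= 0.90, "AGFI": agfi >= 0.90, "CFI": cfi >= 0.95}
    if rmsea_lower is not None:
        result["RMSEA"] = rmsea_lower <= 0.06
    return result


final_rmsea = rmsea(20.62, 16, 144)  # CFA subsample, n = 144
initial = meets_criteria(0.89, 0.80, 0.93)  # no interval reported for this model
final = meets_criteria(0.97, 0.92, 0.99, rmsea_lower=0.00)
```

The recomputed value is close to 0.045, consistent with the reported 0.04 after rounding. The initial model misses every threshold checked, while the modified model passes all four.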
DISCUSSION
The current study was the first to assess the latent structure
of the HAQ-DI with rigorous methodologic tactics.
Although a prior EFA had been performed on the HAQ
domains (18), that analysis did not test how well the
resulting factor structure fit the data, given the limitations of
EFA. Beyond providing a confirmatory analysis of the
HAQ’s latent structure, this study also presented latent
analysis in a 2-step cross validation. Because factor analy-
sis is a sample-dependent technique, the validity of a
factor structure must be tested on an independent sample
for one to have confidence in the results.
The latent cross validation with EFA and CFA provides
much support for the current scoring system of the HAQ-
DI. Knowledge and validation of the latent structure of a
measure is inextricably tied to the knowledge and valida-
tion of a scoring system for a measure. Hence, these data
provide important and necessary validation for the way in
which the HAQ-DI is scored. Although the results do not
suggest that a new scoring for the HAQ-DI is required, new
clinical data are provided to support secondary interpre-
tations of the HAQ-DI based on residual correlations
within a motor skills subdomain. However, caution in the
interpretation of these pairs of scores within this model is
needed. First, the total-score interpretation of the HAQ-DI is
psychometrically the most appropriate interpretation of
domain scores; secondary interpretations based on corre-
lated residuals must only be done as embellishment and
not as a replacement to the HAQ-DI total score. Second,
the correlations between the residuals are moderate at
best, and offer only a bit of useful information beyond the
HAQ-DI total score (albeit, enough to mandate inclusion in
the HAQ-DI model). Third, further validation of the corre-
lated residuals should be undertaken before regular sec-
ondary interpretation of these factors is conducted. Such
validation would require a new study that specifically
tests the correlation interpretation, often necessitating the
inclusion of additional variables in the model from mea-
sures of similar and dissimilar content (44).
A possible limitation to the current study is that the
items for each HAQ-DI domain differ from person to per-
son. This is a necessary and expected aspect of the HAQ-DI
and all related psychometric evaluations of the HAQ-DI,
because HAQ-DI scoring criteria require one to use the
score of the highest item to create the score for the HAQ-DI
domains. The influence of this aspect of the HAQ-DI
should also be validated, and could be done within a
hierarchical structural model. Unfortunately, this valida-
tion would require an immense and diverse sample that is
rarely available in the study of RA.
Table 3. Fit statistics for all structural models*
Model                                GFI   AGFI  CFI   RMSEA  RMSEA 90% CI
Single-factor (χ² = 20.62; 16 df)†   0.97  0.92  0.99  0.04   0.00–0.09
* GFI = Goodness-of-Fit; AGFI = Adjusted Goodness-of-Fit; CFI = Comparative Fit Index; RMSEA = Root Mean Square Error of Approximation; 90%
CI = 90% confidence intervals; df = degrees of freedom.
† P > 0.05.
Two key areas can be addressed in future research: the
viability of the HAQ in other frequently assessed populations
and determination of further scoring system information.
The current study examined the HAQ in a sample
consisting exclusively of persons diagnosed with RA. However,
the HAQ also is used frequently to determine the disease-specific
QOL in other populations, such as those with
osteoarthritis, systemic lupus erythematosus, and other
musculoskeletal conditions. At this time, there is no em-
pirical evidence to suggest that the data obtained herein
are necessarily generalizable to these other disease condi-
tions (45). Byrne (46) has recommended that prior to the
examination of similarity in CFA results for a measure
across various subgroups, one should first determine the
latent structure of the test on a single and appropriate
sample. The current study supplies such information.
Hereafter, it would be beneficial for other research to both
affirm the latent structure of the HAQ for other disease
populations and measure the consistency between those
groups and an RA group (46).
A second method to further examine the HAQ scoring
system is to use item response theory (IRT) (47). IRT
weights each item so that items that indicate the strongest
impact on health assessment receive stronger weights. A
common IRT model used for this analysis is the Rasch
model, which only examines the disease severity of each
item in estimating the overall score for an individual (48).
Like many IRT models, the Rasch model requires unidi-
mensionality (i.e., a single-factor model). CFA is often
used as a tool for determining the unidimensionality as-
sumption in IRT (17) and the current study provides evi-
dence that a single-factor model is appropriate, therefore
suggesting that a Rasch model may work for the HAQ-DI.
However, the Rasch model also asserts that the residuals of
each item should be uncorrelated (49). Therefore, careful
examination of the unidimensionality assumption is nec-
essary to examine the HAQ-DI with IRT, including consid-
ering alternative IRT models (50).
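For orientation, the dichotomous form of the Rasch model can be sketched in a few lines of Python. This is a simplification: HAQ-DI items are scored 0–3, so an actual analysis would need a polytomous extension, and the function and parameter names here are hypothetical.

```python
import math


def rasch_probability(theta: float, severity: float) -> float:
    """Probability that a person with latent disability `theta` endorses a
    difficulty on an item located at `severity` (both on the logit scale)."""
    return 1.0 / (1.0 + math.exp(-(theta - severity)))


# Hypothetical person and item locations (not estimated from study data).
p_equal = rasch_probability(theta=0.0, severity=0.0)
p_mild_item = rasch_probability(theta=0.0, severity=-1.0)
p_severe_item = rasch_probability(theta=0.0, severity=1.0)
```

Only the item's location enters the model, which is why a single underlying dimension and uncorrelated item residuals are central assumptions.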
In summary, this study provides psychometric evidence
of the structural validity of the HAQ-DI within an RA
population. Considering the widespread use of the HAQ-
DI, it is important to demonstrate the psychometric stabil-
ity and validity of the measure. By integrating EFA with a
subsequent CFA, the current study demonstrates the va-
lidity of using the HAQ-DI total score as an estimate of
disability in RA. Of course, as with all studies of validity,
no one study can summarily prove the validity of a mea-
sure, as this must be done through a program of research.
In the future, it would be of interest to determine whether
the structural validity of the HAQ-DI extends to other disease populations.
REFERENCES
1. Fries JF, Spitz P, Kraines RG, Holman HR. Measurement of
patient outcome in arthritis. Arthritis Rheum 1980;23:137–45.
2. Kumar A, Malaviya AN, Pandhi A, Singh R. Validation of an
Indian version of the Health Assessment Questionnaire in
patients with rheumatoid arthritis. Rheumatology (Oxford)
3. El Meidany YM, el Gaafary MM, Ahmed I. Cross-cultural
adaptation and validation of an Arabic Health Assessment
Questionnaire for use in rheumatoid arthritis patients. Joint
Bone Spine 2003;70:195–202.
4. El-Miedany Y, Youssef S, el-Gaafary M, Ahmed I. Evaluating
changes in health status: sensitivity to change of the modified
Arabic Health Assessment Questionnaire in patients with
rheumatoid arthritis. Joint Bone Spine 2003;70:509–14.
5. Bathon JM, Martin RW, Fleischmann RM, Tesser JR, Schiff
MH, Keystone EC, et al. A comparison of etanercept and
methotrexate in patients with early rheumatoid arthritis [pub-
lished erratum appears in N Engl J Med 2001;344:240 and
N Engl J Med 2001;344:76]. N Engl J Med 2000;343:1586–93.
6. Lipsky P, van der Heijde D, St. Clair W, Smolen J, Furst D,
Kalden J, et al. 102-wk clinical & radiologic results from the
ATTRACT trial: a 2 year, randomized, controlled, phase 3
trial of infliximab (Remicade®) in pts with active RA despite
MTX [abstract]. Arthritis Rheum 2000;43:S269.
7. Weinblatt ME, Keystone EC, Furst DE, Moreland LW, Weisman
MH, Birbara CA, et al. Adalimumab, a fully human anti–tumor
necrosis factor α monoclonal antibody, for the treatment of rheumatoid
arthritis in patients taking concomitant methotrexate: the
ARMADA trial. Arthritis Rheum 2003;48:35–45.
8. Wolfe F, Hawley DJ. The longterm outcomes of rheumatoid
arthritis: work disability: a prospective 18 year study of 823
patients. J Rheumatol 1998;25:2108–17.
9. Wolfe F. The determination and measurement of functional
disability in rheumatoid arthritis. Arthritis Res 2002;4 Suppl
10. Wolfe F, Michaud K, Gefeller O, Choi HK. Predicting mortal-
ity in patients with rheumatoid arthritis. Arthritis Rheum
11. Escalante A, del Rincon I, Cornell JE. Latent variable ap-
proach to the measurement of physical disability in rheuma-
toid arthritis. Arthritis Rheum 2004;51:399–407.
12. Pincus T, Summey JA, Soraci SA Jr, Wallston KA, Hummon
NP. Assessment of patient satisfaction in activities of daily
living using a modified Stanford Health Assessment Ques-
tionnaire. Arthritis Rheum 1983;26:1346–53.
13. Schumacker RE, Lomax RG. A beginner’s guide to structural
equation modeling. Mahwah (NJ): Lawrence Erlbaum; 1996.
14. Daleo DV, Lopez BR, Cole JC, Kaufman AS, Kaufman NL,
Newcomer BL, et al. K-ABC simultaneous processing, DAS
nonverbal reasoning, and Horn’s expanded fluid-crystallized
theory. Psychol Rep 1999;84:563–74.
15. Kaufman AS. Intelligent testing with the WISC-III. New York:
16. Radloff LS. The CES-D scale: a self-report depression scale for
research in the general population. Appl Psychol Meas 1977;
17. Cole JC, Rabin AS, Smith TL, Kaufman AS. Development and
validation of a Rasch-derived CES-D short form. Psychol As-
18. Daltroy LH, Phillips CB, Eaton HM, Larson MG, Partridge AJ,
Logigian M, et al. Objectively measuring physical ability in
elderly persons: the Physical Capacity Evaluation. Am J Pub-
lic Health 1995;85:558–60.
19. Preacher KJ, MacCallum RC. Repairing Tom Swift’s electric
factor analysis machine. Underst Stat 2003;2:13–43.
20. Cole JC, Oliver TM, McLeod JS, Ouchi BO. Cross validating
the latent structure of Accuplacer: a factor analytic approach.
Res Schools 2003;10:63–70.
21. Floyd FJ, Widaman KF. Factor analysis in the development
and refinement of clinical assessment instruments. Psychol
22. Paulus HE, Oh M, Sharp JT, Gold RH, Wong WK, Park GS, et
al, and the Western Consortium of Practicing Rheumatologists.
Correlation of single time-point damage scores with observed
progression of radiographic damage during the first 6 years of
rheumatoid arthritis. J Rheumatol 2003;30:705–13.
23. Paulus HE, Wiesner J, Bulpitt KJ, Patnaik M, Law J, Park GS,
et al. Autoantibodies in early seropositive rheumatoid arthri-
tis, before and during disease modifying antirheumatic drug
treatment. J Rheumatol 2002;29:2513–20.
24. Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF,
Cooper NS, et al. The American Rheumatism Association
1987 revised criteria for the classification of rheumatoid ar-
thritis. Arthritis Rheum 1988;31:315–24.
25. Van der Heijde DM, van ’t Hof MA, van Riel PL, Theunisse
LA, Lubberts EW, van Leeuwen MA, et al. Judging disease
activity in clinical practice in rheumatoid arthritis: first step
in the development of a disease activity score. Ann Rheum
26. Blalock SJ, DeVellis RF, Brown GK, Wallston KA. Validity of
the Center for Epidemiological Studies Depression scale in
arthritis populations. Arthritis Rheum 1989;32:991–7.
27. Bruce B, Fries JF. The Stanford Health Assessment
Questionnaire: dimensions and practical applications. Health
Qual Life Outcomes 2003;1:1–6.
28. Schafer JL, Graham JW. Missing data: our view of the state of
the art. Psychol Methods 2002;7:147–77.
29. Schafer JL. NORM. Version 2.03. url: http://www.stat.psu.
30. Rubin DB, Schenker N. Multiple imputation in health-care
databases: an overview and some applications. Stat Med
31. Cole JC, Motivala SJ, Dang J, Lucko A, Lang N, Levin MJ, et al.
Structural validation of the Hamilton Depression Rating
Scale. J Psychopathol Behav Assess 2004;26:241–54.
32. Bollen K, Stine RA. Bootstrapping goodness-of-fit measures in
structural equation models. Sociol Methods Res 1992;21:205–
33. Nevitt J, Hancock GR. Improving the root mean square error of
approximation for nonnormal conditions in structural equa-
tion modeling. J Exp Educ 2000;68:251–68.
34. Comrey AL, Lee HB. A first course in factor analysis. 2nd ed.
Hillsdale (NJ): Lawrence Erlbaum; 1992.
35. Arbuckle JL. Amos. Version 4.02. Chicago: Small Waters;
36. Bentler PM, Bonett DG. Significance tests and goodness-of-fit
in the analysis of covariance structures. Psychol Bull 1980;
37. Hu LT, Bentler PM. Fit indices in covariance structure
modeling: sensitivity to underparameterized model misspeci-
fication. Psychol Methods 1998;3:424–53.
38. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance
structure analysis: conventional criteria versus new alterna-
tives. Struct Equation Model 1999;6:1–55.
39. Arbuckle JL, Wothke W. Amos 4.0 user’s guide. 4.01 ed.
Chicago: Small Waters; 1999.
40. Marcoulides GA. Introduction to structural equation model-
ing. In: Annual meeting of the American Educational Re-
search Association, 1998; San Diego (CA): American Educa-
tional Research Association; 1998.
41. Cohen J. Statistical power analysis for the behavioral sci-
ences. Hillsdale (NJ): Lawrence Erlbaum; 1988.
42. Palmer RF, Graham JW, Taylor B, Tatterson J. Construct va-
lidity in health behavior research: interpreting latent variable
models involving self-report and objective measures. J Behav
43. Anastasi A, Urbina S. Psychological testing. 7th ed. Upper
Saddle River (NJ): Prentice Hall; 1998.
44. Wothke W. Models for multitrait-multimethod matrix analy-
sis. In: Marcoulides GA, Schumacker RE, editors. Advanced
structural equation modeling: issues and techniques. Mah-
wah (NJ): Lawrence Erlbaum; 1996. p. 7–56.
45. Haynes SN, Richard DC, Kubany ES. Content validity in psy-
chological assessment: a functional approach to concepts and
methods. Psychol Assess 1995;7:238–47.
46. Byrne BM. Structural equation modeling with AMOS: basic
concepts, applications, and programming. Mahwah (NJ): Law-
rence Erlbaum; 2001.
47. Hambleton RK, Swaminathan H, Rogers HJ. Fundamentals of
item response theory. Newbury Park (CA): Sage; 1991.
48. Wright BD. A history of social science measurement. Educ
Meas Issues Pract 1997;16:33–45.
49. Linacre JM. Structure in Rasch residuals: why principal com-
ponents analysis? Rasch Meas Trans 1998;12:636.
50. Van der Linden WJ, Hambleton RK, editors. Handbook of
modern item response theory. New York: Springer; 1997.