The Performance of Different Lookback Periods and Sources of Information for
Charlson Comorbidity Adjustment in Medicare Claims
James X. Zhang; Theodore J. Iwashyna; Nicholas A. Christakis
Medical Care, Vol. 37, No. 11. (Nov., 1999), pp. 1128-1139.
Medical Care is currently published by Lippincott Williams & Wilkins.
© 1999 Lippincott Williams & Wilkins, Inc.
The Charlson Score is a particularly popular form of comorbidity
adjustment in claims data analysis. However, the effects of certain
implementation decisions have not been empirically examined.
OBJECTIVE. To determine the effects of alternative data sources and
lookback periods on the performance of Charlson scores in the
prediction of mortality following hospitalization.
SUBJECTS. A representative sample of 1,387 elderly patients
hospitalized in 1993, drawn from the Medicare Current Beneficiary
Survey (MCBS). Three years of linked Medicare claims and survey
instruments were available for all patients, as was 2-year mortality
follow-up.
STATISTICAL METHODS. Nested Cox regression and comparisons of areas
under the Receiver Operating Characteristic (ROC) curve were used to
evaluate ability to predict mortality.
RESULTS. Compared with a 1-year lookback involving solely inpatient
claims, statistically and empirically significant improvements in the
prediction of mortality are obtained by incorporating alternative
sources of data (particularly 2 years of inpatient data and 1 year of
outpatient and auxiliary claims), but only if indices derived from
distinct sources of data are entered into the regression distinctly.
The area under the ROC curve for 1-year mortality prediction increases
from 0.702 to 0.741 (P = 0.002). Furthermore, these improvements in
explanatory power are obtained whether or not one also controls for
Charlson scores based on self-reported health history and/or secondary
diagnoses from the claim for the index hospitalization itself.
Finally, claims-based comorbidity adjustment performs comparably to
survey-derived adjustment, with areas under the ROC curve of 0.702 and
0.704, respectively.
CONCLUSIONS. The widespread comorbidity adjustment in pre-existing
administrative data sources can be improved by taking more complete
advantage of existing administrative data sources.
Key Words: Medicare; co-morbidity; data quality. (Med Care
1999;37:1128-1139)

*From the Department of Medicine, the University of Chicago, Chicago,
Illinois.
†From the Harris School of Public Policy, the University of Chicago,
Chicago, Illinois.
‡From the Pritzker School of Medicine, the University of Chicago,
Chicago, Illinois.
§From the Population Research Center, the University of Chicago,
Chicago, Illinois.
∥From the Department of Sociology at the University of Chicago,
Chicago, Illinois.
This work was supported by a grant from the Alzheimer's Association
(TRG-95-033), the National Institute on Aging (1 R01 AG15326-01)
(NAC), and by a Medical Scientist Training Program grant from the
National Institutes of Health (5 T32 GM07281) (TJI).
Address correspondence to: Nicholas A. Christakis, MD, PhD, MPH,
Section of General Internal Medicine, The University of Chicago
Medical Center, 5841 S. MC 2007, Chicago, IL 60637.
Received October 21, 1998; initial review completed December 30, 1998;
accepted May 25, 1999.

Among the most popular comorbidity indices in claims data research are
those based on the work of Mary Charlson et al,1 particularly as
implemented in the International Classification of Diseases, 9th
Revision, Clinical Modification (ICD-9-CM) codes for computerized
use.2-4 While several
alternative risk-adjustment approaches have also
been published,5-9 the Charlson method is extremely
popular.2,10-13 Direct comparisons between different
comorbidity measures are relatively rare, however.14-17
In general, these indices have been developed to predict
mortality following hospitalization (a pattern to which we
will adhere), although alternative outcomes2,14,18,19
and settings20-25 have also been evaluated.
In implementing comorbidity adjustment for
mortality risk following hospitalization, many
practical issues have been decided on the basis of
convenience, experience, judgement, and data
availability, rather than on an empirical examina-
tion of the effects of these decisions on the per-
formance of the comorbidity index in question.
Two areas have received considerable attention: (1)
the difficulties caused by the disease coding
schemes used in administrative databases26-28 and
related data-quality issues,15,29-31 and (2) the
question of study-specific reweighting of the
Charlson score. However, a number of other
important issues are only beginning to be ad-
dressed, particularly related to the amount of
longitudinal data that should be collected on each
individual. Thus, researchers working with Califor-
nia's Office of Statewide Health Planning and
Development discharge abstract data regularly
perform risk adjustment without the ability to link
to any previous claims,27 whereas those using
Medicare claims data may utilize all inpatient
claims for several years.2,10-13 Similarly, in the
development of incidence cohorts, the appropriate
"lookback" time (that is, the amount of retrospec-
tive surveillance necessary to ensure that the dis-
ease is incident and not prevalent) has been
carefully examined in cancer,33 although not in
other diseases and not for risk-adjustment purposes.
Study has also begun on the value of administrative
sources of data other than inpatient claims (eg, with
respect to cancer incidence34,35 or the use of
supplementary survey-derived data25,36-38).
Here, we take advantage of the longitudinal,
individually linked inpatient, outpatient, and physician
claims available in the Medicare Current Beneficiary
Survey. For a cohort of patients hospitalized in 1993,
we examine the impact of alternative lengths of lookback
(1 vs. 2 years) and of alternative data sources
(inpatient claims only vs. inpatient plus "outpatient"
and "auxiliary" claims vs. all these claims plus
self-report) on the performance of the Charlson score
with respect to
mortality following hospital admission. We specif-
ically evaluate two different ways the comorbidity
information may be combined statistically; that is,
we evaluate whether it makes a difference if the
data from different sources (eg, inpatient and
outpatient claims) are combined into a single
overall Charlson score (the usual approach), or,
alternatively, are kept distinct, with separate
Charlson scores developed for each data source
and entered into regression models as distinct
vectors of covariates.
Our general approach is to assume that most
researchers study patients initially identified in
inpatient claims. We consider the marginal value
of additional sources of other earlier claims-based
information once one already has controlled for
comorbidity levels detected in the earlier inpatient
claims. This imposes an a priori hierarchy on the
claims, looking first at inpatient claims, second at
outpatient claims, and finally at auxiliary claims.
We also consider the value of two other sources of
information to supplement inpatient claims-based
lookbacks: the use of self-reported data on medi-
cal history and secondary diagnoses present on
the claim for the index hospitalization.
Sources of Data
The cohort was drawn from the 1991 cohort of
the MCBS. This nationally-representative sample
of approximately 13,000 Medicare beneficiaries
drawn in 1991 continues to be maintained, as
described elsewhere.39 The MCBS contains quar-
terly survey data linked to all Medicare claims
(including inpatient, outpatient, and auxiliary ser-
vice claims) filed during the calendar year during
which the subject is followed in the survey. We
have developed a panel data set by linking the
releases of the MCBS from 1991 through 1994.
Our study cohort consisted of all individuals
aged 65 years and older in the original MCBS
sample (in 1991) who were hospitalized in 1993.
Subjects entered the study cohort upon admission
to the hospital for the first time in the 1993
calendar year, their "index hospitalization." For
those hospitalized more than once in 1993, subsequent
hospitalizations were ignored. We required that cohort
members be at least 67 years old in 1993 to allow for
2 years of previous Medicare claims to be inspected for
the development of the comorbidity indicators. In- and
out-of-hospital mortality follow-up was available
through January 1, 1995, for all cohort members, at
which point survival was censored.
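
The cohort rule described above (first 1993 admission per patient, age 67 or older) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `index_hospitalizations` and the tuple layout `(patient_id, admission_date, age_in_1993)` are hypothetical and do not reflect the actual MCBS file structure.

```python
from datetime import date

def index_hospitalizations(admissions, year=1993, min_age=67):
    """Return {patient_id: date of first admission in `year`}.

    `admissions` is an iterable of (patient_id, admission_date, age)
    tuples; this layout is an assumption made for illustration.
    Later admissions in the same year are ignored, as in the text.
    """
    index = {}
    for patient_id, admitted, age in admissions:
        if admitted.year != year or age < min_age:
            continue  # outside the study year or too young for a 2-year lookback
        if patient_id not in index or admitted < index[patient_id]:
            index[patient_id] = admitted
    return index

# Hypothetical records, not study data
records = [
    ("A", date(1993, 9, 1), 70),
    ("A", date(1993, 3, 12), 70),   # earliest 1993 admission for A
    ("B", date(1992, 11, 5), 81),   # not in 1993
    ("C", date(1993, 6, 30), 66),   # younger than 67
]
cohort = index_hospitalizations(records)
```

Keeping only the earliest admission mirrors the statement that subsequent hospitalizations in 1993 were ignored.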
Construction of Charlson Comorbidity
Claims-Based Algorithm. For each time period (eg, a 1-year
lookback and a second-year lookback), a comorbidity score was
generated for each cohort member by searching through the
entire MCBS Medicare files; these include "inpatient" claims
(hospital), "outpatient" claims (which in HCFA terminology are
claims for outpatient care filed by institutional providers),
and "auxiliary" service claims (physician/supplier, skilled
nursing facility, home health aid, hospice, and durable
medical equipment). Traditional, office-based outpatient care
is typically billed in the physician/supplier claims; however,
tests indicated that distinguishing physician/supplier claims
from other auxiliary claims did not increase the explanatory
power of the models (data not shown).
The algorithm we used to search the claims and to assign
Charlson scores is a minor variant of the Deyo2 and Romano3
methods; in particular, we employed ICD-9-CM condition codes
appearing in either method, but excluded the procedure codes
advocated by Romano. Two lookback periods were established: a
1-year lookback (days 1-365) and a second prior-year lookback
(days 366-730). Day 1 for the lookback periods is the day
preceding admission for the index hospitalization. The
following abbreviations are defined in Table 1 and used in the
tables and figures to clarify the way Charlson measures are
constructed. "In (1)" is a Charlson score based on the
inpatient claims from the 365 days preceding the index
hospitalization. "In (2)" is a Charlson score based on the
diagnoses present on inpatient claims from only the second
prior year of inpatient claims, regardless of the diagnoses
present in the first year or in any other data source.
Likewise, "Out (1)" is the Charlson score based on a 1-year
lookback in the institutional outpatient claims, and "Aux (1)"
is the Charlson score based on a 1-year lookback in all other
claims, the so-called "auxiliary" claims. We also constructed
a Charlson score based on the secondary diagnoses (up to nine
are recorded in the claims) from the index hospitalization
("Sec").
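
The two lookback windows defined above can be expressed as a small date calculation. The sketch below is an illustration only: `lookback_window` is a hypothetical helper, and which date on a claim should be compared with the index admission is an assumption not settled here.

```python
from datetime import date

def lookback_window(claim_date, index_admission):
    """Classify a claim into lookback year 1, year 2, or neither.

    Day 1 is the day preceding the index admission, so a claim dated
    the day before admission has days_before == 1.
    """
    days_before = (index_admission - claim_date).days
    if 1 <= days_before <= 365:
        return 1      # first-year lookback, eg "In (1)"
    if 366 <= days_before <= 730:
        return 2      # second prior-year lookback, eg "In (2)"
    return None       # on or after admission, or older than 730 days

admission = date(1993, 6, 15)   # hypothetical index admission
```

A claim on the admission date itself falls in neither window, which keeps the index hospitalization out of the retrospective scores.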
Self-Report-Based Algorithm. The MCBS
contains questions allowing for a self-reported
history of the following diseases: "hardening of
the arteries," myocardial infarction, angina, stroke,
brain hemorrhage, cancer, diabetes, rheumatoid
arthritis, Alzheimer's disease, emphysema,
asthma, chronic obstructive pulmonary disease,
and partial paralysis. Some of these questions
correspond to certain Charlson score categories,
and points were assigned as appropriate. Because
severity could not be determined from the MCBS
questionnaire, diseases with multiple severity lev-
els in the Charlson system were assigned to the
lowest severity level.
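
The lowest-severity rule can be illustrated as follows. This is a sketch under explicit assumptions: the mapping from survey items to Charlson categories shown here is hypothetical (the text does not list it), only two conditions are included, and the weights are the original Charlson weights for those categories.

```python
# Hypothetical mapping: survey item -> Charlson severity levels (name, weight).
# Only conditions with more than one severity level are shown.
SEVERITY_LEVELS = {
    "diabetes": [("diabetes, mild or moderate", 1),
                 ("diabetes with end-organ damage", 2)],
    "cancer": [("any tumor", 2),
               ("metastatic solid tumor", 6)],
}

def self_report_points(reported_conditions):
    """Sum Charlson points, assigning each reported condition
    the weight of its lowest severity level."""
    total = 0
    for condition in set(reported_conditions):
        levels = SEVERITY_LEVELS.get(condition)
        if levels:  # survey items with no Charlson category add nothing
            total += min(weight for _, weight in levels)
    return total
```

Because severity cannot be read from the questionnaire, a self-reported cancer contributes 2 points here rather than the 6 assigned to metastatic disease.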
Modeling Methodologies. Cox regression
was used to model the effects of alternative co-
morbidity measures. All regressions control for
race (white vs. non-white), gender, and age (cap-
tured as age and age squared). Furthermore, the
primary diagnoses of the index hospitalization
were categorized into 18 categories, which were
formed for consistency with previous typologies40
and to ensure that no one group was too small
(the results were not sensitive to the particular
categorization employed for the index hospitaliza-
tion primary diagnosis [data not shown]). These
diagnostic categories were treated as "nuisance
parameters" in the estimation of the Cox models,
allowing for maximal flexibility without requiring
proportionality in the shape of the hazard function
across disease categories41; in doing so, however,
separate coefficients are not estimated for these
variables. Selected Cox-regression likelihood ratio
χ2 statistics are presented and are denoted as G2.
The likelihood ratio χ2 statistics can be converted
into an R2 analog by using the formula

$$R^2 = 1 - \exp\left(-G^2 / n\right),$$

in which n = 1,387. For nested Cox models, the difference in
the G2 between models, denoted ΔG2, has a χ2 distribution with
as many degrees of freedom as there are differences in the
number of covariates between the models.
Presented ROC curves are based on the prediction of 365-day
mortality following admission using the same covariates as the
Cox regression. Statistical comparisons were performed using
ROCKIT (University of Chicago, Department of Radiology,
Chicago, Illinois; www-radiology.uchicago.edu/sections/roc).44,45
Probabilities are reported for one-sided comparisons for the
statistically significant increase in the area under the ROC
curve. The conventional probability levels of significance
(P ≤ 0.1 worthy of report; P ≤ 0.05 significant) were used.

TABLE 1. Simple Statistics on Select Comorbidity Measures
Mean SD Maximum
Year-1 inpatient lookback ["In (1)"]
Inpatient lookback for year-2 ["In (2)"]*
Year-1 outpatient lookback ["Out (1)"]
Year-1 auxiliary lookback ["Aux (1)"]
In (1) + In (2) + Out (1) + Aux (1)
Self-reported disease history ["Self"]
2° Diagnoses from index hospitalization ["Sec"]
Maximum: In (1) + In (2) + Out (1) + Aux (1) + Sec + Self
Note: This table contains the number of individuals for whom each Charlson score was observed, and the mean,
standard deviation, and maximum Charlson score realized in the claims of those patients who had at least one
claim in the respective data sources.
*This is a Charlson score based on the 730th through the 366th day before admission.
Interpretation of Statistical Tests. Compar-
isons of three types are made across models.
ROC analysis is used to compare models that
contain different (non-nested) sets of variables
based on logistic regression predicting mortality
within 365 days of hospitalization. This has the
virtue of easy comparability across models and
familiarity. However, logistic regression cannot
make full use of the detail of the mortality data
that is available; thus, we also use Cox regres-
sion to capture all the information about when
patients die. For two distinct purposes, we use
G2 and an R2 analog when examining Cox
models. We use G2 to allow formal statistical
comparisons of nested models (ie, comparisons
between two regressions in which the covariates
of one model are a subset of the other model).
This is analogous to the use of F-tests in ordi-
nary least-squares regression. To compare non-
nested models, we provide R2 analogs, which,
while often appearing trivially small for Cox
regression models, nevertheless allow the com-
parison of relative magnitudes. In summary,
while each indicator is imperfect in some way,
we use triangulation across all three to present
the best-supported analysis of the data.
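
For readers less familiar with the ROC measure, the area under the curve can be computed directly as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The sketch below is a plain-Python illustration of that quantity only; it is not the ROCKIT procedure used for the statistical comparisons, and the function name and inputs are hypothetical.

```python
def roc_auc(predicted_risk, died):
    """Area under the ROC curve via pairwise comparison.

    predicted_risk: model scores (eg, fitted probabilities of death
    within 365 days); died: matching 0/1 outcomes. Ties count 1/2.
    """
    deaths = [s for s, d in zip(predicted_risk, died) if d]
    survivors = [s for s, d in zip(predicted_risk, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for s_dead in deaths:
        for s_alive in survivors:
            if s_dead > s_alive:
                concordant += 1.0
            elif s_dead == s_alive:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))

# Hypothetical scores, not study data
auc = roc_auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

An area of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of deaths above survivors.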
Parameterization of the Charlson Score: Indicator
Variables Used. In all cases, an indicator-variable
approach was taken when including the Charlson score in
regressions, as has been suggested elsewhere. In practice,
this means that a set of dummy variables was constructed
for each patient for each Charlson score value; if their
Charlson score was equal to 2, then the dummy for
"Charlson is 2" was set to 1, and all others (eg, the
dummies "Charlson is unobserved," "Charlson is observed to
be zero," "Charlson is 1," "Charlson is 3," "Charlson is 4
or greater") were set to zero. Two differences with
previous work are important to note. First, we
distinguished between individuals without any
claim filed during the lookback window ("unob-
served Charlson"), and those for whom at least
one claim was filed but on which no Charlson
diseases were indicated ("observed Charlson of
zero"). In past work, these groups often appear to
be combined and assigned a Charlson value of
zero. Second, because of the relatively small num-
ber of individuals who had Charlson scores of four
and greater, these higher values were combined
into a single category.
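
The indicator-variable coding just described can be sketched as below. The function and column names are hypothetical, and the choice of "observed Charlson of zero" as the omitted reference category is an assumption made so that five indicators remain; the text does not say which category was omitted.

```python
CATEGORIES = ("unobserved", "0", "1", "2", "3", "4+")

def charlson_category(score):
    """Map a Charlson score (or None when no claim was filed in the
    lookback window) to one of six categories."""
    if score is None:
        return "unobserved"
    if score >= 4:
        return "4+"          # scores of four and greater are pooled
    return str(score)

def charlson_dummies(score, reference="0"):
    """Return five 0/1 indicators, omitting the reference category
    (assumed here to be an observed score of zero)."""
    category = charlson_category(score)
    return {
        f"charlson_{c}": int(c == category)
        for c in CATEGORIES
        if c != reference
    }
```

Passing `None` rather than 0 for patients with no claims preserves the distinction between an unobserved score and an observed score of zero.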
We also tested a linear, continuous specification
of the Charlson score and found with one excep-
tion the same patterns reported later. With the
linear specification, the second year of inpatient
data appears to be less valuable than with the
specification employed here (data not shown).
Parameterization of the Charlson Score: "Single" Versus
"Separate" Vectors. Finally, we tested two alternative ways
to incorporate alternative sources of data and lookback
periods. In the first method, a model was specified which
combined all data sources into a single Charlson score
without regard to the data source in which a constituent
disease was detected. In such models, there were a total of
five variables indicating levels of the Charlson score, as
explained earlier. Thus, Cox regression models took the form:

$$\ln h = \beta_1 \cdot Dx + \beta_2 \cdot Dem + \beta_3 \cdot C$$

in which Dx is a vector of index hospitalization
primary-diagnosis indicator variables treated as a nuisance
parameter (so $\beta_1$ is not explicitly estimated), Dem is a
vector of demographics variables, and C is a set of five
indicator variables for the levels of the Charlson score.
This is the "single vector" Charlson specification.
An alternative approach allows separate Charlson scores based
on each data source and enters them into the regression
separately, as in:

$$\ln h = \beta_1 \cdot Dx + \beta_2 \cdot Dem + \beta_3 \cdot C_{in} + \beta_4 \cdot C_{out} + \beta_5 \cdot C_{aux},$$

in which $C_{in}$ is a vector of five indicator variables
for the level of a Charlson score based on inpatient
data, $C_{out}$ is a vector of five indicator variables for
the level of a Charlson score based on outpatient
data, and $C_{aux}$ is a vector of five indicator variables
for the level of a Charlson score based on auxiliary
claims data. This is the "separate vector" Charlson
score specification. Note that, in this specification,
a single Charlson-diagnosis (eg, chronic obstructive
pulmonary disease) could contribute to the
score of both $C_{in}$ and $C_{out}$ if it was noted in both
the inpatient and outpatient claims.
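
The difference between the two specifications can be made concrete with a small sketch. It is illustrative only: the function names are hypothetical, only a handful of Charlson diseases are included (with their original weights), and the unobserved-versus-zero distinction is left out for brevity.

```python
# Original Charlson weights for a few constituent diseases (subset only)
WEIGHTS = {
    "myocardial infarction": 1,
    "congestive heart failure": 1,
    "chronic pulmonary disease": 1,
    "hemiplegia or paraplegia": 2,
    "metastatic solid tumor": 6,
}

def charlson_score(diseases):
    """Sum weights over distinct diseases."""
    return sum(WEIGHTS[d] for d in set(diseases))

def single_vector_score(diseases_by_source):
    """One score from the union of diseases across all sources."""
    pooled = set().union(*diseases_by_source.values())
    return charlson_score(pooled)

def separate_vector_scores(diseases_by_source):
    """One score per source; a disease seen in two sources counts in both."""
    return {src: charlson_score(ds) for src, ds in diseases_by_source.items()}

# Hypothetical patient: COPD noted in both inpatient and outpatient claims
patient = {
    "in": {"chronic pulmonary disease", "myocardial infarction"},
    "out": {"chronic pulmonary disease"},
    "aux": set(),
}
```

For this patient the pooled score is 2, whereas the separate scores are 2, 1, and 0, so the outpatient record of the same disease is visible to the regression only in the separate-vector form.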
Descriptive Statistics of Study Cohort
Of the elderly subjects (≥ 67 years in 1993) in
the MCBS, 1,387 were hospitalized at least once in
1993. Their mean age was 78.2 years (standard
deviation: ± 7.7) and 38.8% were male, 86.7%
were white, and 158 (11.4%) had died by January
1, 1995. The mean Charlson score assigned to
those who had a claim of each type is presented in
Table 1 for each data source. In general, scores
developed from distinct data sources have moder-
ate correlations, typically in the 0.25 to 0.50 range
(data not shown).
"Single Vector" Charlson: Negligible Value
of Additional Data Sources
As shown in Fig. 1 and Table 2, when data from
different sources are combined into a single Charl-
son index, there is no clearly superior combination
of data sources. A single year of inpatient data
performs as well as a Charlson index based on any
combination of inpatient, outpatient, and auxiliary
claims. All provide an area under the ROC curve of
approximately 0.70 for 1-year mortality, and nearly
Moreover, for these single-index Charlson
scores, the areas under the ROC curve are only
minimally different from the area obtained by
simply adjusting for age, race, sex, and primary
diagnosis of the index hospitalization (Table 2,
"Single Vector" column). The areas under the ROC
curve are not statistically different between models
with and without a single-vector Charlson score at
conventional levels. The G2 statistic indicates that
the inpatient Charlson scores do significantly increase
the explanatory power of the Cox model
(Fig. 1, ΔG2 = 53.1 - 42.5 = 10.6, 5 d.f., P = 0.06).
However, there is no particular advantage to any
multiple-data source, single-vector Charlson rela-
tive to inpatient-only single vector Charlson
scores (Fig. 1).
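
The nested-model comparison quoted above, and the R2 analog given in the Fig. 1 caption (R2 = 1 - exp(-G2/1387)), can be checked with a short calculation. This is a sketch only: the function names are hypothetical, and the χ2 tail probability is computed from the standard closed-form series for integer degrees of freedom rather than with whatever software the authors used.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of half-integer terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)
    return total

def r2_analog(g2, n=1387):
    """R2 analog of a likelihood ratio statistic: 1 - exp(-G2 / n)."""
    return 1.0 - math.exp(-g2 / n)

delta_g2 = 53.1 - 42.5            # year-1 inpatient Charlson vs. base model
p_value = chi2_sf(delta_g2, 5)    # five indicator variables added
```

With these inputs the tail probability is close to the reported P = 0.06, and `r2_analog(53.1)` rounds to the 0.038 quoted for the 1-year inpatient model.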
"Separate Vector" Charlson:
Complementarity of Alternative Sources of
Claims Data and Longer Inpatient
An alternative approach to judging the value of
the different sources of information is to create
separate Charlson indices from each data source
and to enter them into regressions separately. This
amounts to acknowledging that diseases recorded
in inpatient claims and diseases recorded in other
claims may have different import with respect to
their severity and, hence, should be allowed to
have a different impact on mortality. A similar
argument might be made about time horizon (eg,
with more recently detected diseases in still-living
individuals being more "severe" than long-standing
ones). This more flexible parameterization, taking
advantage of the implicit information in data source,
reveals that alternative sources of information do
have some value.

[Figure 1. Legend: G2 for Separate Vector; G2 for Single Vector. Recoverable row labels: Base Demographics + 1° Dx; Year-2 Outpatient Lookback; Year-1 Auxiliary Claims Lookback; Year-2 Auxiliary Claims Lookback; Index Hospitalization 2° Diagnoses.]
FIG. 1. Likelihood ratio χ2 statistics (G2) from Cox regressions with alternative comorbidity measures. These likelihood
ratios (G2) for the alternative comorbidity measures show the increase in explanatory power associated with different
data sources and parameterizations. When comparing nested models for which the Charlson scores were entered as
separate vectors, the difference in the G2 is χ2 distributed; each data source provides five degrees of freedom.
Non-nested models cannot be directly compared using G2; G2 must be converted to R2 analogs using the formula
R2 = 1 - exp(-G2/1387). Note that the R2 analogs are a monotonic function of G2 scores. All models presented control
for the age, race, gender ("base demographics"), and primary diagnosis on index hospitalization of the patients. Models
including Charlson scores based on secondary diagnoses from the index hospitalization are distinguished in the figure
by the gray background on the right, as there are important conceptual difficulties in the interpretation of these data.

As reported earlier, there was a 10.6-point
increase in G2 associated with the addition of the
1-year inpatient lookback to a model which only
controlled for demographics and index-hospitalization
primary diagnosis (Fig. 1, ΔG2 = 53.1 - 42.5 = 10.6,
5 d.f., P = 0.06). Using separate vectors, each
alternative source of Charlson scores within the
1-year lookback appears to be detecting important and
different comorbidity. Thus, there is
an increase of 12.2 points (Fig. 1, ΔG2 = 65.3 -
53.1 = 12.2, 5 d.f., P = 0.03) in the G2 relative to
the inpatient-only model when both 1-year inpatient
and outpatient Charlson scores are included
in the model as separate vectors (this corresponds
to an increase in the R2 analog from 0.038 to
0.046). There is a further rise in G2 of 9.4 points
when the Charlson based on a 1-year lookback in
the auxiliary claims is added (Fig. 1, ΔG2 = 74.7 -
65.3 = 9.4, 5 d.f., P = 0.09). A similar pattern of
informativeness of different data sources as
gauged by changes in the area under the ROC
curve can also be noted in Table 2 and is shown
visually in Fig. 2. The area under the ROC curve
with demographics, primary diagnosis, and 1-year
inpatient Charlson score was 0.702; with the addition
of the 1-year outpatient and auxiliary
claims, the area increased to 0.724 (P = 0.02).

TABLE 2. Performance of Alternative Data Sources as Measured by the Area Under the ROC Curve
Single Vector    Separate Vectors
Base Demographics + 1° Diagnosis [No Charlson Score]
Year-1 Inpatient Lookback ["In (1)"]
Year-1 Inpatient + Year-1 Outpatient Lookback ["In (1) + Out (1)"]
Year-1 Inpatient + Year-1 Outpatient + Year-1 Auxiliary Claims Lookback ["In (1) + Out (1) + Aux (1)"]
Year-1 Inpatient + Year-2 Inpatient Lookback ["In (1) + In (2)"]
In (1) + In (2) + Out (1) + Out (2)
In (1) + In (2) + Out (1) + Out (2) + Aux (1) + Aux (2)
In (1) + Out (1) + Aux (1) + In (2)
In (1) + Out (1) + In (2)
Self-Reported Disease History ["Self"]
In (1) + Self
In (1) + Out (1) + Aux (1) + In (2) + Self
2° Diagnoses from Index Hospitalization ["Sec"] + In (1)
Sec + In (1) + Out (1) + Aux (1) + In (2)
Note: All models which include Charlson scores also control for the base demographics and index
hospitalization primary diagnosis.
[Figure 2. Horizontal axis: False Positive Fraction.]
FIG. 2. ROC curves for predicting 365-day mortality
from alternative data sources. Displayed are the ROC for
three logistic regression models predicting death within
1 year of admission for the index hospitalization. All
models control for patient demographics and primary
diagnosis. In the model with multiple data sources for
the Charlson score, each entered the regression as a
separate vector. The area under the ROC curve without
Charlson adjustment is 0.697, with 1-year inpatient
claims-based adjustment ("In [1]") is 0.702, and 2 distinct
years of inpatient claims, 1 year of outpatient claims, and
1 year of auxiliary claims lead to an area under the ROC
curve of 0.741.

As shown in Figs. 1 and 2 and in Table 2, a
regular pattern was also found across the different
tests for the informativeness of the second year of
data for the different data sources. A second year
of inpatient data was valuable. However, the use of
the 366-to-730-day lookback within the alternative
(ie, outpatient or auxiliary) data sources did
not improve the performance of that Charlson
score measure. More specifically, as shown in Fig.
1, there was a meaningful increase in the likelihood
ratio χ2 statistic in the Cox regression models
comparing a model with the 1-year inpatient
lookback to a model with both the first and the
second year of inpatient lookback (Fig. 1, ΔG2 =
63.4 - 53.1 = 10.3, 5 d.f., P = 0.06). However, there
was no increase in G2 for the addition of the
second year of outpatient lookback to a model that
already controlled for 1-year outpatient lookback,
demographics, and primary diagnosis (Fig. 1, ΔG2
= 61.1 - 56.5 = 4.6, 5 d.f., P = 0.47). Similarly, the
addition of the second year of auxiliary data did
not significantly increase the G2 of any model (eg,
Fig. 1, ΔG2 = 90.1 - 83.8 = 6.3, 10 d.f., P = 0.74).
This overall pattern was confirmed by inspecting
the nonsignificant, individual coefficients for the
additional data in the nested Cox regressions (data
not shown) and by examining the changes in areas
under the ROC curve presented in Table 2
("Separate Vector" column). In summary, the addition of
a second year of inpatient data to a model already
containing 1 year of inpatient lookback produced a
meaningful difference in the areas under the ROC
curve (area under the ROC increased from 0.702 to
0.720; P = 0.03); however, the addition of a second
year of outpatient data or of auxiliary data did not
produce meaningful or statistically significant
Marginal Detection Efficacy of Data
The marginal detection efficacy of each source
of data for each of the 17 constituent diseases of
the Charlson score is shown in Table 3. The table
is read as follows: 27 patients were found to
have myocardial infarction indicated as a diag-
nosis on an inpatient hospitalization claim for
which the patient was discharged in the 365
days preceding the index hospitalization admis-
sion. An additional nine patients were found to
have such a comorbidity when inspecting the
outpatient claims for the same period. Twenty-
nine additional patients were indicated to have
such a comorbidity in the Auxiliary claims,
which brought the total of patients with a
Charlson Score contribution from myocardial
infarction to 65. However, the relative propor-
tion of new cases identified in each source
varied across diagnoses; thus, the use of addi-
tional sources of data contributed to the Charl-
son score by detecting the constituent disease
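
The way Table 3 is read (each source credited only with cases not already found in a source earlier in the hierarchy) can be sketched as below. The function name is hypothetical and the patient identifiers are synthetic, constructed only so that the counts reproduce the myocardial infarction example in the text (27, 9, and 29, for a total of 65).

```python
def marginal_detection(cases_by_source,
                       hierarchy=("inpatient", "outpatient", "auxiliary")):
    """Count patients newly detected by each source, in hierarchy order.

    cases_by_source maps a source name to the set of patient IDs with
    the disease recorded in that source. Returns (counts, total).
    """
    already_found = set()
    counts = {}
    for source in hierarchy:
        new_cases = set(cases_by_source.get(source, ())) - already_found
        counts[source] = len(new_cases)
        already_found |= new_cases
    return counts, len(already_found)

# Synthetic IDs chosen to mirror the myocardial infarction counts in the text
mi_cases = {
    "inpatient": set(range(0, 27)),     # 27 patients
    "outpatient": set(range(20, 36)),   # 16 patients, 9 of them new
    "auxiliary": set(range(30, 65)),    # 35 patients, 29 of them new
}
counts, total = marginal_detection(mi_cases)
```

Changing the order of `hierarchy` would change the marginal counts but not the total, which is why the a priori ordering of inpatient, outpatient, then auxiliary claims matters when interpreting the table.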
The Independent Value of Self-Reported
We also evaluated the contribution of self-reported
comorbidity to the construction of Charlson
comorbidity indices. When included as an
undifferentiated data source in the single-vector
Charlson Index, self-reported diagnoses had little
or no value as compared with exclusive claims-based
comorbidity detection. When included in
the regressions as a separate vector of dummies,
the addition of self-report data to a model containing
the 1-year inpatient lookback increased G2
by 13.4 points (Fig. 1, ΔG2 = 66.5 - 53.1 = 13.4,
5 d.f., P = 0.02; R2 increased from 0.038 to 0.047).
The addition of self-report data to a model containing
separate vector Charlsons for 1-year inpatient,
outpatient, and auxiliary claims, and a
second-year inpatient lookback increased G2 by
4.7 points (Fig. 1, ΔG2 = 88.5 - 83.8 = 4.7, 5 d.f.,
P = 0.45).
In the ROC analysis shown in Table 2, self-report
data failed to significantly increase the area under
the ROC curve versus regressions containing demographics,
primary diagnosis, and either just 1-year
inpatient-based Charlson or 1-year inpatient, 1-year
outpatient, 1-year auxiliary, and a second year of
inpatient-based separate-vector Charlson scores.
Conversely, and more importantly, the claims-based
Charlson scores did increase the area under the ROC
curve relative to a model containing demographics,
primary diagnosis, and a self-report-based Charlson
score; the area increased to 0.743 from 0.704 (P =
0.03), with the addition of the Charlson based on a
1-year inpatient, 1-year outpatient, 1-year auxiliary,
and a second year of inpatient-based separate vector
Similarity of Pattern When Using Secondary
Diagnoses From Index Hospitalization
When combined into a single Charlson score,
the addition of secondary diagnoses from the
index hospitalization itself significantly increases
the predictive power of 1-year inpatient-only
Charlson score (based on G2 in Fig. 1 of 78.2 and
53.1, R2 increases to 0.055 from 0.038). However,
even better performance is achieved in a model
that omits the retrospective inpatient data from
the single Charlson score and uses only the secondary
diagnoses (based on G2 in Fig. 1 of 92.2
and 53.1, R2 is 0.064 vs. 0.038).
TABLE 3. Marginal Detection of Constituent Comorbidities by Alternative Data Sources
Not in Year-1
Year-1 Auxiliary But
Not in Year-1 Inpatient
or Outpatient (detected
Congestive heart failure
Peripheral vascular disease
Chronic pulmonary disease
Peptic ulcer disease
Liver disease (mild)
Diabetes (mild or moderate)
Hemiplegia or paraplegia
Liver disease (moderate or severe)
Metastatic solid tumor
Note: For each Charlson disease, the marginal number of patients who were found to have that disease by data
source is shown; each column excludes any cases also noted to have that disease in a data source indicated in a
column to its left. The original weights assigned by Charlson et al1 and used in this study are provided for
This pattern does not hold if the data are treated in the
separate-vector specification. In that case, a pattern
similar to that observed in other separate-vector
parameterizations occurs. The addition to the baseline
model (controlling for demographics and the primary
diagnosis of the index hospitalization) of a Charlson score
based on secondary diagnoses from the index hospitalization
raised the likelihood ratio χ² statistic by 49.7 points
(Fig. 1, ΔG² = 92.2 - 42.5 = 49.7, 5 d.f., P < 0.001). The
further addition of the 1-year inpatient Charlson raised
the G² an additional 4.3 points (Fig. 1, ΔG² = 96.5 -
92.2 = 4.3, 5 d.f., P = 0.51). The addition of other
claims-based Charlson scores and/or the addition of the
self-report-based Charlson also raised the G², although not
by statistically significant amounts. Similarly, the
addition of the three additional claims-based Charlson
scores did increase the area under the ROC curve from 0.727
(for demographics plus primary diagnosis plus
secondary-diagnosis-based Charlson plus 1-year inpatient
Charlson) to 0.751, a statistically significant increase in
predictive power (P = 0.01). However, there are
interpretive difficulties in using any of these measures
derived from the secondary diagnoses of the index
hospitalization that are discussed later.
In a representative sample of Medicare beneficiaries, we
examined the performance of Charlson scores based on
alternative sources of data. Statistically and empirically
significant improvements in the prediction of mortality can
be obtained by incorporating alternative sources of data
(particularly 2 years of inpatient lookback combined with
1 year of outpatient and auxiliary claims lookback) but
only if indices derived from distinct sources of data are
entered into the regression distinctly. Furthermore, we
found that these improvements in explanatory power largely
held whether or not one also controlled for Charlson scores
based on self-reported health history and/or based on the
secondary diagnoses from the claim for the index
hospitalization.
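The distinction between a single pooled Charlson score and separate-vector scores can be made concrete with a small sketch. It assumes the original Charlson weights for the nine conditions named in Table 3; the source names, the example patient, and the function names are hypothetical, and hierarchy rules (for example, counting only the more severe liver category) are omitted for brevity.

```python
from typing import Dict, Set

# Original Charlson weights for the conditions named in Table 3 (a subset).
CHARLSON_WEIGHTS: Dict[str, int] = {
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "chronic_pulmonary_disease": 1,
    "peptic_ulcer_disease": 1,
    "mild_liver_disease": 1,
    "diabetes_mild_or_moderate": 1,
    "hemiplegia_or_paraplegia": 2,
    "moderate_or_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
}


def charlson_score(conditions: Set[str]) -> int:
    """Sum the weights of the conditions present."""
    return sum(CHARLSON_WEIGHTS[c] for c in conditions)


def single_score(sources: Dict[str, Set[str]]) -> int:
    """Pool conditions across all sources (each counted once), then score."""
    pooled: Set[str] = set().union(*sources.values())
    return charlson_score(pooled)


def separate_scores(sources: Dict[str, Set[str]]) -> Dict[str, int]:
    """Score each data source on its own; each becomes its own covariate."""
    return {name: charlson_score(conds) for name, conds in sources.items()}


# Hypothetical patient: conditions detected in each data source.
patient = {
    "inpatient_1yr": {"congestive_heart_failure"},
    "outpatient_1yr": {"congestive_heart_failure", "diabetes_mild_or_moderate"},
    "auxiliary_1yr": set(),
    "inpatient_2nd_yr": {"metastatic_solid_tumor"},
}

print(single_score(patient))     # one covariate for the regression
print(separate_scores(patient))  # one covariate per data source
```

In the pooled form the regression sees one number and cannot tell which source contributed a diagnosis; in the separate form each source gets its own coefficients, so a diagnosis seen only in outpatient claims can carry a different weight from one recorded during a prior admission.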
Surprisingly, our results overall showed that the
Charlson indices provided only modest improve-
ments over simply controlling for the age, race,
sex, and index hospitalization primary diagnosis of
the patients. Among papers that report evalua-
tions of comorbidity adjustment, similarly modest
performance has been reported in some cas-
es,7,17,18 although not always.25 In absolute mag-
nitude, the areas under the ROC curve that we
report are quite similar to those published previ-
ously using an inpatient-only Charlson score to
predict in-hospital mortality among coronary ar-
tery bypass patients.17 More generally, the Charl-
son score has typically been validated by demon-
strating differences in utilization or mortality
between score levels rather than by assessing its
absolute increase in explanatory power.
While we found that the explanatory power of
the Charlson score could be augmented by the use
of survey-derived self-report of health history, as
has been suggested by previous work using the
SF-36 and other comorbidity
schemes, we also found that an index based
on inpatient claims data alone had approximately
the same explanatory power as an index based on
survey-derived data alone. Although survey-
derived data are often not available, when they
are, they seem to tap somewhat distinct "health"
information as compared with the inpatient
claims; self-report of health history and
outpatient/auxiliary claims may be substitutes for each
other. These conclusions, however, are particularly
dependent on the self-reported health history
instrument available. The instrument in the MCBS
was not optimized for the development of Charl-
son scores and superior performance would prob-
ably be obtained with a more focused survey.36
Finally, our results demonstrated that the use of earlier
claims can significantly augment risk adjustment using the
secondary diagnoses of the index hospitalization. It is
well known that there are important conceptual difficulties
with the use of secondary diagnoses from the index
hospitalization to adjust for the prehospitalization level
of comorbidity in a patient population; in particular, it
is impossible to assess whether the secondary diagnoses
from the index hospitalization represent true pre-existing
comorbidities that complicated the patient's care (and,
hence, are appropriate for risk adjustment), or rather if
they represent the result of complications and suboptimal
treatment of a patient (and, hence, should be considered
outcomes of care, not comorbidities). The fraction of time
that any individual diagnosis is a complication rather than
a comorbidity may vary as a function of both the
institutions and procedures under study. For the purposes
of this article, we do not need to take a position on this
methodological debate but merely note that if one chooses
to proceed with risk adjustment using secondary diagnoses,
one can still improve the accuracy of the model by using
diverse prior claims data sources, as well (if the
information from different data sources is incorporated
distinctly).
This work is not without its limitations. First and
foremost, we have looked only at ways in which the
conventional ICD-9-CM-based versions of the Charlson score
may be operationalized in the claims. Our results are
subject to all the significant and well-known limitations
of the Charlson score implemented in administrative data.
Second, we used a representative sample of the elderly and
looked at their hospitalizations and consequent mortality;
thus, our population is relatively more healthy than a
representative sample of hospitalizations. Different
performance characteristics might be found in different
subpopulations. Our use of the MCBS allowed us to look at
many different sources of data, including self-reported
health history; however, the small size of the data set
limited our ability to perform detailed analyses on
restricted subpopulations. Third, our ability to generate a
Charlson score from the self-reported health history is
obviously dependent on the particular questionnaire that
was used. Fourth, we have not performed an exhaustive
search of alternative specifications (for example, the use
of a quadratic continuous Charlson score instead of our
multiple indicator variable approach) nor of alternative
outcomes (such as inpatient mortality, total resource use,
or length of stay); naturally, alternative data sources
might perform differently when validated against different
outcomes.
Our data do confirm the following: (1) that the Charlson
comorbidity index, in conjunction with basic demographics,
does have explanatory power to predict mortality following
hospitalization, and (2) that the simple use of additional,
readily available claims data sources can significantly
enhance that explanatory power.
The authors thank Marshall Chin for his helpful
comments and Kim Thomas for administrative support.
1. Charlson ME, Pompei P, Ales KL, MacKenzie
CR. A new method of classifying prognostic comorbidity
in longitudinal studies: Development and validation.
J Chronic Dis 1987;40:373-383.
2. Deyo RA, Cherkin DC, Ciol MA. Adapting a
clinical comorbidity index for use with ICD-9-CM admin-
istrative databases. J Clin Epidemiol 1992;45:613-619.
3. Romano PS, Roos LL, Jollis JG. Adapting a
clinical comorbidity index for use with ICD-9-CM ad-
ministrative data: Differing perspectives. J Clin Epide-
4. D'Hoore W, Sicotte C, Tilquin C. Risk-
adjustment in outcome assessment: The Charlson co-
morbidity index. Methods Inf Med 1993;32:382-387.
5. Iezzoni LI. Risk adjustment for measuring health
outcomes. Chicago: Health Administration Press,
6. Elixhauser A, Steiner C, Harris DR, Coffey
RM. Comorbidity measures for use with administrative
data. Med Care 1998;36:8-27.
7. Schwartz M, Iezzoni LI, Moskowitz MA, Ash
AS, Sawitz E. The importance of comorbidities in
explaining differences in patient costs. Med Care
8. Iezzoni LI, Daley J, Heeren T, et al. Using
administrative data to screen hospitals for high compli-
cation rates. Inquiry 1994;31:40-55.
9. Kuykendall DH, Ashton CM, Johnson ML,
Geraci JM. Identifying complications and low provider
adherence to normative practice using administrative
data. Health Serv Res 1995;30:531-554.
10. D'Hoore W, Bouckaert A, Tilquin C. Practical
considerations on the use of the Charlson comorbidity
index with administrative data bases. J Clin Epidemiol
11. Christakis NA, Escarce JJ. Survival of Medicare
patients following enrollment in hospice programs.
N Engl J Med 1996;335:172-178.
12. Iwashyna TJ, Zhang JX, Lauderdale DS,
Christakis NA. A method for identifying married cou-
ples in the Medicare claims data: Mortality, morbidity
and health care utilization among the elderly. Demogra-
13. Roos NP, Wennberg JE, Malenka DJ, et al.
Mortality and reoperation after open and transurethral
resection of the prostate for benign prostatic hyperplasia.
N Engl J Med 1989;320:1120-1124.
14. Romano PS, Roos LL, Jollis JG. Further evi-
dence concerning the use of a clinical comorbidity index
with ICD-9-CM administrative data. J Clin Epidemiol
15. Roos LL, Sharp SM, Cohen MM. Comparing
clinical information with claims data: Some similarities
and differences. J Clin Epidemiol 1991;44:881-888.
16. Hughes JS, Iezzoni LI, Daley J, Greenberg L.
How severity measures rate hospitalized patients. J Gen
Intern Med 1996;11:303-311.
17. Ghali WA, Hall RE, Rosen AK, Ash AS,
Moskowitz MA. Searching for an improved clinical
comorbidity index for use with ICD-9-CM administra-
tive data. J Clin Epidemiol 1996;49:273-278.
18. Kieszak SM, Flanders WD, Kosinski AS,
Shipp CC, Karp H. A comparison of the Charlson comorbidity
index derived from the medical record data and
administrative data. J Clin Epidemiol 1999;52:137-142.
19. Librero J, Peiro S, Ordinana R. Chronic co-
morbidity and outcomes of hospital care: Length of stay,
mortality, and readmission at 30 and 365 days. J Clin
20. DesHarnais SI, McMahon Jr. LF, Wroblewski RT,
Hogan AJ. Measuring hospital performance: The development
and validation of risk-adjusted indexes of mortality,
readmissions, and complications.
Med Care 1990;28:1127-1141.
21. Brailer DJ, Kroch E, Pauly MV, Huang J.
Comorbidity-adjusted complication risk. Med Care
22. Weiner JP, Starfield BH, Steinwachs DM,
Mumford LM. Development and application of a
population-oriented measure of ambulatory care case-
mix. Med Care 1991;29:452-472.
23. Starfield B, Weiner J, Mumford L, Steinw-
achs D. Ambulatory Care Groups: A categorization of
diagnoses for research and management. Health Serv
24. Fedson DS, Wajda A, Nicol JP, Hammond GW,
Kaiser DL, Roos LL. Clinical effectiveness of influenza
vaccination in Manitoba. JAMA 1993;270:1956-1961.
25. Fowles JB, Weiner JP, Knutson D, Fowler E,
Tucker AM, Ireland M. Taking health status into
account when setting capitation rates: A comparison of
risk-adjustment methods. JAMA 1996;276:1316-1321.
26. Iezzoni LI, Foley SM, Daley J, Hughes J, Fisher
ES, Heeren T. Comorbidities, complications, and coding
bias: Does the number of diagnosis codes matter in
predicting in-hospital mortality? JAMA 1992;267:2197-2203.
27. Green J, Wintfeld N. How accurate are hospital
discharge data for evaluating effectiveness of care? Med
Care 1993;31:719-731.
28. Romano PS, Mark DH. Bias in the coding of
hospital discharge data and its implications for quality
assessment. Med Care 1994;32:81-90.
29. Fisher ES, Whaley FS, Krushat M, et al. The
accuracy of Medicare's hospital claims data: Progress has
been made, but problems remain. Am J Pub Health
30. Fowles JB, Lawthers AG, Weiner JP, Gernick
DW, Petrie DS, Palmer RH. Agreement between phy-
sicians' office records and Medicare Part B claims data.
Health Care Financ Rev 1995;16:189-199.
31. Lauderdale DS, Goldberg J. The expanded
racial and ethnic codes in the Medicare data files: Their
completeness of coverage and accuracy. Am J Pub Health
32. Deyo RA. Adapting a clinical comorbidity index
for use with ICD-9-CM administrative data: A response.
J Clin Epidemiol 1993;46:1081-1082.
33. McBean AM, Warren JL, Babish JD. Measur-
ing the incidence of cancer in elderly Americans using
Medicare claims data. Cancer 1994;73:2417-2425.
34. McClellan M, Roghmann H, Schilling P. The
validity of Medicare claims for studies of cancer inci-
dence: Breast, prostate, and lung cancer in the elderly.
Bethesda, MD: SEER-Medicare Data Users Workshop,
June 24, 1998.
35. Klabunde C. Assessment of comorbidity using
claims data. Bethesda, MD: SEER-Medicare Data Users
Workshop, June 24, 1998.
36. Katz JN, Chang LC, Sangha O, Fossel AH,
Bates DW. Can comorbidity be measured by question-
naire rather than medical record review? Med Care
37. Muhajarine N, Mustard C, Roos LL, Young
TK, Gelskey DE. Comparison of survey and physician
claims data for detecting hypertension. J Clin Epidemiol
38. Robinson JR, Young TK, Roos LL, Gelskey
DE. Estimating the burden of disease: Comparing ad-
ministrative sources of data and self-report. Med Care
39. Adler GS. A profile of the Medicare current beneficiary
survey. Health Care Financ Rev 1994;15:153-163.
40. Collins JG. Prevalence of selected chronic con-
ditions: United States, 1990-1992. National Center for
Health Statistics. Vital Health Stat 1997;10:(194).
41. Allison PD. Survival analysis using the SAS
System: A practical guide. Cary, NC: SAS Institute, 1995.
42. Magee L. R-Squared measures based on Wald
and likelihood ratio joint significance tests. Am Statisti-
43. Metz CE. Basic principles of ROC Analysis. Sem
Nuclear Med 1978;8:283-298.
44. Metz CE, Wang P-L, Kronman HB. A new
approach for testing the significance of differences
between ROC curves measured from correlated data.
In: Deconinck F, ed. Information Processing in
Medical Imaging. The Hague: Martinus Nijhoff,
45. Metz CE, Herman BA, Roe CA. Statistical
comparisons of two ROC curve estimates obtained from
partially-paired data sets. Med
46. Roos LL, Stranc L, James RC, Li J. Complications,
comorbidities, and mortality: Improving classification
and prediction. Health Serv Res
47. Iezzoni LI. Assessing quality using administrative
data. Ann Intern Med 1997;127:666-674.