Presently, most managed care organizations
(MCOs) manage population risk by applying
disease management or case management pro-
grams. The targeting of specific events or diseases is
a proxy for risk because members who were hospi-
talized or have ≥1 specific (“index”) diseases (such
as asthma, congestive heart failure, or diabetes mel-
litus) experience higher claims costs than the aver-
age for the plan. However, there are many ways of
trying to identify potentially high-cost patients,
including using diseases (such as AIDS or renal fail-
ure) likely to require expensive treatments, previous
hospital or emergency department utilization, or the
level or rate of increase in recent medical expenses.
Different interventions may be applied to each
group depending on the identification method.
Interventions are used on these members with the
objectives of reducing costs and improving outcomes.
This approach has 2 obvious disadvantages. First,
some patients without one of the index diseases are
likely to incur high medical expenses. Second, dis-
eased populations are not homogeneous; thus, mem-
bers whose medical conditions are well controlled
are statistically less likely to experience higher-than-
average costs in the future (and, therefore, are less
likely to benefit from any intervention). The tradi-
tional approach has a third shortcoming: because of
the intensity of resources required to manage the
potentially high-risk members, health plans limit
their interventions either to case management of the
current high-cost patients or to broad, often untar-
geted, disease management programs.
“Population risk management” consists of 3
components:
• Identification of high-risk populations or
subpopulations.
• Use of specific interventions in the high-risk
group to reduce the resource utilization and cost
of the group.
• Application of pricing and underwriting tech-
niques to convey financial signals to plan mem-
bers and sponsors. A group’s health insurance
premiums, for example, could be based on their
estimated future utilization rather than on the
traditional underwriting method of a projection of
historical costs.
VOL. 9, NO. 5 THE AMERICAN JOURNAL OF MANAGED CARE 381
MANAGERIAL
A Prediction Model for Targeting
Low-Cost, High-Risk Members of
Managed Care Organizations
Henry G. Dove, PhD; Ian Duncan, BPhil, BA; and Arthur Robb, PhD
Objective: To describe the development and validation of
a predictive model designed to identify and target HMO
members who are likely to incur high costs.
Study Design: Split-sample multivariate regression analysis.
Patients and Methods: We studied enrollees in a 350 000-
member HMO with ≥1 claim in 1998 and 1999. The predic-
tion model uses a combination of clinical and behavioral
variables and 1998 and 1999 claims data. The prediction
model was applied and used to rank low-cost patients (1998
cost <$2000) according to their estimated probability of incur-
ring costs ≥$2000 in 1999. For prospective testing, we applied
our models to data that are not available in advance. The same
prediction model was applied to rank a different set of low-
cost patients (1999 cost <$2000) according to estimated prob-
ability of incurring costs ≥$2000 in 2000. Because the
predictions were used for disease management purposes, the
outcomes of a randomly selected control group that received no
disease management intervention were analyzed. The
predictive accuracy of the model was tested by comparing the
percentages of “targeted” vs all low-cost patients who incurred
high costs in the subsequent year.
Results: Of the low-cost, top-ranked 1998 patients, 47.8%
incurred high (≥$2000) medical expenses in 1999 vs 14.2% of
randomly selected patients who were low cost in 1998. Of the
top-ranked 1999 patients, 39.7% incurred high costs in 2000
vs 12.2% of the randomly selected low-ranked patients.
Conclusions: The prediction model successfully identifies
low-cost, high-risk patients who are likely to incur high costs
in the next 12 months.
(Am J Manag Care 2003;9:381-389)
From the Division of Health Policy and Administration,
Department of Epidemiology and Public Health, Yale University,
New Haven, Conn (HGD); Lotter Actuarial Partners, Inc, New York,
NY (ID); and LandaCorp, Inc, Montclair, NJ (AR).
This study was supported by Landacorp, Inc, Atlanta, Ga.
Corresponding author: Henry G. Dove, PhD, Division of Health
Policy and Administration, Department of Epidemiology and Public
Health, Yale University, 60 College St, PO Box 208034, New
Haven, CT 06520-8034. E-mail: dove@worldnet.att.net.
It is well known in healthcare that a small per-
centage of the members of a health plan consume a
significant percentage of its resources; it is assumed,
often incorrectly, that the behavior of this minority
is replicated period after period. This assumption is
refuted by the data in Table 1. These data, repre-
senting approximately 209 000 members of a 350 000-
member health plan, show that the highest-cost
members, those costing ≥$25 000 in incurred claims
in 1998 (and, hence, candidates for traditional case
management), represented 1% of the members but
21% of the cost in 1998; in the following year, this
cohort consumed only 7% of total plan costs.
Conversely, enrollees in the lowest-cost medical
class, with costs <$2000 in 1998, accounted for 58%
of all costs in 1999 (Table 1). The phenomenon
being illustrated here, regression to the mean (in
which the resource consumption of most high-cost
patients generally decreases, even in the absence
of any intervention), is well known in health plans
but seems to be overlooked as plans attempt to
find and manage their high-risk members.
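The regression to the mean visible in Table 1 can be checked with simple arithmetic. The sketch below uses only the per capita figures reported in Table 1; the function and variable names are illustrative and are not part of the study.

```python
# Average per capita expenses ($) from Table 1, keyed by 1998 expense category.
# Each tuple is (1998 per capita, 1999 per capita) for the same 1998 cohort.
table1 = {
    "Low (<$2000)": (324, 1191),
    "Medium ($2000-$24 999)": (5658, 5385),
    "High (>=$25 000)": (49032, 15800),
}

def change_ratio(cost_1998, cost_1999):
    """Return 1999 per capita cost as a multiple of the 1998 per capita cost."""
    return cost_1999 / cost_1998

ratios = {name: change_ratio(*costs) for name, costs in table1.items()}

for name, ratio in ratios.items():
    print(f"{name}: 1999 per capita cost is {ratio:.2f} times the 1998 level")
```

A ratio above 1 for the low-cost cohort and well below 1 for the high-cost cohort is the pattern the text describes: both groups drift toward the plan average in the following year.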
In population risk management, a variety of clin-
ical and behavioral variables are used to rank each
patient according to his or her estimated probability
of incurring high medical expenses in a subsequent
period. This article describes the design, develop-
ment, and validation of a prediction model targeting
selected members in a large regional HMO in the
southwest United States. The goal is to identify low-
cost patients (<$2000 in the “base year”) who are
likely to become high-cost patients (in the absence
of any intervention) in the subsequent year.
OTHER COMPONENTS OF
POPULATION RISK MANAGEMENT
Two other components of a population risk man-
agement strategy are interventions (programs that
aim to change patient behavior, healthcare delivery
processes, and patient outcomes through education
and coaching) and pricing and underwriting (tech-
niques that aim to change behavior through price
signals). Selecting appropriate, cost-effective inter-
vention strategies for targeted patients and reflecting
prospective risk in pricing and underwriting deci-
sions (to the extent allowed by regulatory and ethi-
cal constraints) are related challenges that are topics
for future articles.
STUDY POPULATION, DATA SOURCES,
AND GOALS
The prediction model was developed on patients
who were enrolled in a large HMO in 1998 and 1999
and had at least 1 medical or pharmacy claim in
both years. Patient medical claims and pharmacy
claims incurred in the subsequent year were the
source of outcomes for these patients. Population
risk management aims to reduce the cost of the tar-
geted population; although this may result in
improved health markers, the objective of risk
management is to improve financial outcomes for
the health plan. No reviews of patient medical
records, questionnaires, or special surveys are con-
ducted. Reliance on administrative data is an effi-
cient, low-cost approach that is ideal for population
risk management.
DATA PREPARATION
Before creating a prediction model, the demo-
graphic data and medical and pharmacy claims of
MCO patients were checked for completeness,
integrity, and consistency. Data preparation includ-
ed several activities:
• Adopting an adequate “run-out” period (in this case 4 months), determined using standard actuarial methods for assessing the completeness of incurred claims data.
• Separating medical expenses into categories such as professional services, hospital inpatient services, hospital outpatient services, laboratory and diagnostic tests, and pharmaceutical items. Data checks were used to measure the data's internal consistency against benchmarks. Data that were rejected based on the diagnostic reports were resubmitted and rechecked.
• Distinguishing the employee (policyholder) and his or her dependents by assigning each enrollee a unique member number. The claims for each patient were collected, tabulated, and grouped into a patient-centered database.
• Identifying “covered charges” that reflected only those medical services that the MCO was obligated to pay.
• Using the amount the MCO paid for each claim rather than the billed or charged amount or the amount patients paid.

Table 1. Distribution of Enrollees by Expense Categories and Percentage of Total Expenditures in 1998 and 1999

| Medical Expense Category in 1998 | 1998 Average Per Capita Expenses, $ | 1998 Total Enrollment, % | 1998 Total Expenditures, % | 1999 Average Per Capita Expenses, $ | 1999 Total Expenditures, % |
|---|---|---|---|---|---|
| Low (<$2000) | 324 | 87 | 23 | 1191 | 58 |
| Medium ($2000-$24 999) | 5658 | 12 | 56 | 5385 | 35 |
| High (≥$25 000) | 49 032 | 1 | 21 | 15 800 | 7 |

Members of the health plan are subject to differ-
ent plan designs, with variable copays, limits, exclu-
sions, and so on, as set by their employers. It can be
argued that a member’s behavior is influenced by
the specific design of the benefit; however, this is
one of many variables that we do not recognize in
our modeling. The potential for incorrectly assign-
ing a member (as high risk or not high risk) based
solely on plan design is considered to be minor.
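The patient-centered aggregation described in this section can be sketched as follows. The record layout (member ID, service date, paid date, expense category, paid amount) and all sample values are hypothetical. The run-out rule shown, which counts only claims paid within 4 months of year end, is one simple reading of a run-out period and is not the actuarial completion method referred to above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical claim records:
# (member_id, service_date, paid_date, expense_category, amount_paid_by_mco)
claims = [
    ("M001", date(1998, 3, 2), date(1998, 4, 1), "professional", 120.0),
    ("M001", date(1998, 11, 20), date(1999, 2, 15), "pharmacy", 45.5),
    ("M002", date(1998, 6, 9), date(1998, 7, 1), "inpatient", 8200.0),
    ("M002", date(1998, 12, 30), date(1999, 6, 10), "outpatient", 300.0),  # paid after run-out
]

# Assumption: a 4-month run-out after the end of 1998.
RUN_OUT_CUTOFF = date(1999, 4, 30)

def build_member_table(claims, year, cutoff):
    """Sum paid amounts per member and category for claims incurred in
    `year` and paid on or before `cutoff`."""
    totals = defaultdict(lambda: defaultdict(float))
    for member_id, service_date, paid_date, category, paid in claims:
        if service_date.year == year and paid_date <= cutoff:
            totals[member_id][category] += paid
    return {member: dict(cats) for member, cats in totals.items()}

members_1998 = build_member_table(claims, 1998, RUN_OUT_CUTOFF)
annual_paid = {member: sum(cats.values()) for member, cats in members_1998.items()}
print(annual_paid)
```

Keying every total on the member number, rather than on the policyholder, is what makes the resulting table patient-centered.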
OUTCOME MEASURES
The concepts of “patient risk” and “outcomes”
require clarification because these terms are often
unclear or imprecise. Outcomes is a broad term that
can have very different meanings for clinicians, epi-
demiologists, utilization management specialists,
risk managers, actuaries, and quality assurance pro-
fessionals. Outcomes may pertain to functional sta-
tus, patient satisfaction, mortality, hospital
utilization, or cost. Consistent with the objective of
population risk management, our sole focus was on
financial outcomes, as defined by total incurred
medical claims in a 1-year period.
USE OF A THRESHOLD COST FOR
“HIGH-RISK” PREDICTION—FOCUS ON
MEMBERS WITH BASE YEAR COST <$2000
Our analyses focus on a “≥$2000/<$2000” criteri-
on, which raises 2 issues:
• Why a threshold? Why not predict dollar cost?
• If we are going to recognize a threshold, why
$2000 and not another dollar amount?
A cutoff value of $2000 for annual patient-
incurred medical (including pharmacy) expenses
was used to establish a binary outcome variable, that
is, enrollees were either low-cost patients (<$2000)
or high-cost patients (≥$2000). A component of our
definition of patient risk, the probability that a
patient will incur paid medical expenses ≥$2000 in
the succeeding 1-year period, is based on a patient’s
characteristics in the base year. The base year in
this article is 1998, and the subsequent year is 1999.
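As a minimal sketch, the binary outcome for the study population can be written as a small function. The $2000 cutoff comes from the text; the function names are illustrative only.

```python
THRESHOLD = 2000.0  # annual paid medical and pharmacy expenses, in dollars

def is_high_cost(annual_paid):
    """A member is high cost when annual paid expenses are >= $2000."""
    return annual_paid >= THRESHOLD

def outcome_for_low_cost_member(base_year_cost, next_year_cost):
    """Return 1 if a low-cost base-year member becomes high cost in the
    subsequent year, 0 if the member stays low cost, and None if the member
    was not low cost in the base year (outside the study population)."""
    if is_high_cost(base_year_cost):
        return None
    return 1 if is_high_cost(next_year_cost) else 0

print(outcome_for_low_cost_member(324.0, 1191.0))   # stays low cost
print(outcome_for_low_cost_member(1500.0, 2000.0))  # becomes high cost
```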
The $2000 threshold was chosen for practical and
statistical reasons and to differentiate our approach
from the traditional approach used in disease man-
agement and case management. Disease manage-
ment–and case management–based methods for
identifying high-risk members rely on diagnoses
(using International Classification of Diseases, Ninth
Revision, Clinical Modification, codes) and events
(hospitalizations, emergency department visits, etc).
Frequently, members traditionally labeled as high risk
have base year costs ≥$2000 and often considerably
>$2000. Conversely, a population with costs <$2000
in the base year is not traditionally thought of as
being at high risk. Yet, our data and models show
that this population contains a considerable num-
ber of high-risk members. The models are general
and may be applied to any dollar threshold.
Because the subpopulation of low-cost consumers
is large, the targeting process will deliver a large
number of potential high-risk intervention candi-
dates who generate a significant percentage of a
health plan’s total expense. Conversely, if the focus
is on only the “repeaters” from the high-cost sub-
groups, relatively little total cost is identified for
intervention and management. From a business per-
spective, health plans are interested in identifying
subgroups whose management will have a noticeable
effect on the plan’s overall financial results.
Our definition of risk extends beyond cost. In our
work, we used a compound dependent variable
designed to capture 2 important risk factors: the
absolute amount of claims and the incidence or con-
centration of those claims. Thus, a patient whose
claims are more highly aggregated represents a high-
er risk than a patient who has the same absolute
amount of claims spread over 12 months. In general,
healthcare costs are transformed if they are to be
used as a dependent variable. As pointed out in the
recent Society of Actuaries study of risk adjusters,¹
large claims should be truncated. Owing to their skewed
distribution, costs must be logarithmically transformed. In
a sense, the binary variable (≥/<$2000) is the sim-
plest transform. This form of transform in turn allows
us to include additional information (on the concen-
tration of claim amount) in the dependent variable.
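The three transformations mentioned above (truncation, logarithmic transformation, and the binary threshold) can be compared side by side. This is a sketch under stated assumptions: the truncation point of $100 000 is hypothetical, and log(1 + cost) is used so that zero cost is defined. Neither choice is taken from the article, and the proprietary measure of claim concentration is not reproduced.

```python
import math

CAP = 100000.0       # hypothetical truncation point, not from the article
THRESHOLD = 2000.0   # binary cutoff used in the article

def truncate(cost, cap=CAP):
    """Limit the influence of very large claims by capping annual cost."""
    return min(cost, cap)

def log_transform(cost):
    """log(1 + cost); the +1 is an assumption so that zero cost maps to 0."""
    return math.log1p(cost)

def binary_transform(cost, threshold=THRESHOLD):
    """1 when annual cost is at or above the threshold, else 0."""
    return 1 if cost >= threshold else 0

costs = [0.0, 324.0, 1999.99, 2000.0, 49032.0, 250000.0]
transformed = [
    (truncate(c), round(log_transform(truncate(c)), 3), binary_transform(c))
    for c in costs
]
for cost, row in zip(costs, transformed):
    print(cost, row)
```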
There is a simple reason why categorical variables
are advantageous: raw dollar amounts are not uni-
formly calibrated from patient to patient. In other
words, 2 patients may have been subject to the same
medical services but may incur significantly different
costs due to differences in plan design, choice of
providers, or provider billing practices. Because sig-
nificant cost differences can be due to exogenous fac-
tors, it is preferable to use cost as a relative
approximation of risk rather than as an absolute
measure. In this sense, the $2000 threshold is mere-
ly a simple categorical variable.
The specific threshold of $2000 was chosen for 2
reasons. One is a practical consequence of the fact
that this model was used to select patients for an
intervention. In general, to show positive return on
investment in an intervention program, as well as to
simply have statistically measurable outcomes,
interventions require significant numbers of poten-
tial enrollees. Use of the $2000 threshold casts a
“wide net” for intervention targets. Second, because
higher costs are driven by events, it was deemed
advantageous to choose a threshold that correlated
strongly to the existence of an event as both a posi-
tive and negative proxy indicator. The $2000 thresh-
old fulfills this need.
PREVIOUS RESEARCH
Few articles have appeared in peer-reviewed jour-
nals that attempt to identify patients likely to incur
high medical expenses in a subsequent year among
patients who incurred modest medical expenses in
the preceding or base year. Meenan and his colleagues² at the Kaiser Permanente Center for Health Services Research developed and tested 3 models to identify high-cost risk status in a large sample of approximately 100 000 HMO members from 3 health plans. LoBianco et al³ studied high-cost Medicaid users. Forman and his colleagues⁴,⁵ and Lynch et al⁶ studied repeaters. We believe that our
analytical technique addresses an important under-
studied group: individuals with no obvious previous
costs who are less likely to be identified in the other
studies referred to previously.
The more common focus of research using
administrative data sets has been for the purpose of
risk adjustment.⁷⁻¹⁰ The statistical methods, data
sources, and variables used for identifying high-cost
patients and creating risk adjusters are similar.
However, the goal of risk adjustment is not to iden-
tify individual patients with high-cost conditions or
to intervene in their care. The main objective of risk
adjustment is to accurately predict the average
annual expenditures for an individual patient to
redistribute premiums to health plans. Thus, the
coefficients or groupings models that researchers
have created for risk adjustment are not designed for
identifying high-cost patients, although these models
also use disease categories, comorbidities, and
demographic variables.
Targeting the Right Patients in
Population Risk Management
The study objective was to develop a prediction
model using variables in medical and pharmacy
claims data sets to identify patients with medical
expenses <$2000 in 1998 who were likely to incur
high medical expenses (≥$2000) in 1999. This study
was the first phase of a population risk manage-
ment project in which these targeted members
were randomly assigned to an intervention consist-
ing of a nurse-based, outbound-telephone survey
with 3 purposes:
• To identify gaps in care.
• To further stratify the population to identify “false
positives” whose diseases are well controlled.
• To help members become more compliant with
the prescribed treatment regimen.
Targeted members were assigned randomly to
intervention and control groups to evaluate the
interventions. The effectiveness of the interventions
is a subject for future articles.
Table 1, which was constructed from data before
the introduction of a targeted intervention program,
shows that a significant number of high-cost patients
in 1998 became low-cost patients in 1999.
Identification of high-risk members for interventions
is just one aspect of population risk management.
Equally important is the identification of high-cost
1998 members whose medical costs will decline
because they represent a group for whom the appli-
cation of resources should be limited.
Modeling for population risk management should
take into account a patient’s characteristics in addi-
tion to his or her base year expenses. On a larger
scale, claims-based prediction models should also
take into account plan composition. Significant dif-
ferences among MCO populations occur because of
plan-specific factors such as plan type (independent
practice association vs staff model, etc), capitation
agreements, copayment/deductible agreements,
Medicare/commercial mix, and level of pharmaceuti-
cal caps. Regional differences in medical expenses
may also reflect cultural differences, variation in clin-
ical practices, and availability of medical resources. If
an MCO has a large number of enrollees in a single
region, we found that it is preferable to develop a pre-
diction model for a single MCO rather than to pool
and then try to make statistical adjustments to
claims data from multiple heterogeneous health
plans. Most of the effort lies in preparing the data,
not in making statistical calculations.
DEVELOPMENT OF THE
PREDICTION MODEL
The model was developed for a large HMO with an
average enrollment of approximately 350 000 persons.
The 209 000 patients studied had at least 1 claim in
1998 and 1999 and costs of <$2000 in 1998. The
base year was 1998, and 1999 was the subsequent
period in which financial outcomes were measured
from medical and pharmacy claims. The standard
split-sample technique (model developed on half of
the 1998 patients and tested and validated on the
other half) was used to prevent overfitting.
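A split-sample design of this kind can be sketched as a seeded random partition of member identifiers. The member IDs, the seed, and the function name are illustrative; the article does not describe how its halves were drawn.

```python
import random

def split_sample(member_ids, seed=0):
    """Randomly split member IDs into a development half and a validation half."""
    ids = list(member_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

# Hypothetical member IDs 1..1000
development, validation = split_sample(range(1, 1001), seed=42)
print(len(development), len(validation))
```

Coefficients are estimated on the development half only; the validation half is scored with those fixed coefficients, so any overfitting shows up as a drop in accuracy.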
The prediction model was developed using multiple
regression analysis. Several regression
models were calculated using dependent variables
such as the ≥$2000/<$2000 binary variable, various
transformations for the 1999 cost, and a proprietary
“cost grouper” that measures the degree of cost con-
centration in a given period. The final independent
patient variables included patient age, sex, number
of specific comorbid conditions, number of distinct
drug classes, number of physician visits, and non-
physician/nonhospital medical utilization. Binary
(0=absent and 1=present) variables were created
that flagged diabetes, cardiovascular, respiratory,
and psychiatric diseases.* Other independent vari-
ables that reflect behavioral factors (eg, the primary
treatment regimen for each disease state, the
patient’s prescription compliance pattern, and the
patient’s propensity to keep regular appointments
with physicians) were created through a transforma-
tion of each patient’s medical and pharmacy claims
in the base year and were available for inclusion in
the model. In the Appendix, we show an example of
the specific coefficients and variables for one model
used to develop predictions.
The relationships between input variables and
outcomes (eg, costs <$2000 annually/costs ≥$2000
annually) may not be linear. However, input vari-
ables can often be transformed so that the resulting
relationship between the aggregates of transformed
values of the independent variable and the dependent
variable is linear or nearly linear. For example,
the relationship between the number of comorbid
conditions and the probability of annual costs being
≥$2000 is nearly linear (Figure 1).
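One way to check such a relationship is to aggregate members by the (capped) number of comorbid conditions, compute the proportion with costs ≥$2000 in each group, and fit a straight line to those proportions. The counts below are hypothetical and were chosen to be exactly linear; they are not the data behind Figure 1, and the unweighted least-squares fit is an illustrative choice.

```python
# Hypothetical counts per comorbidity group:
# group -> (members, members with next-year cost >= $2000); group 4 means ">=4".
groups = {0: (1000, 100), 1: (600, 120), 2: (300, 90), 3: (120, 48), 4: (60, 30)}

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = sorted(groups)
rates = [groups[x][1] / groups[x][0] for x in xs]  # aggregated proportion per group
slope, intercept = fit_line(xs, rates)
residuals = [rate - (intercept + slope * x) for x, rate in zip(xs, rates)]

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print("largest absolute residual:", max(abs(r) for r in residuals))
```

Small residuals relative to the group proportions indicate that the capped count can enter the regression as a single linear term.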
RANKING ALL ENROLLEES, BASED ON
CLAIMS AND THE PREDICTION MODEL
By applying the regression coefficients to low-cost
patients’ individual characteristics, a numerical
score was calculated for every patient. The score
directly corresponds to the probability that the
patient will incur medical expenses ≥$2000 in the
subsequent year. For the purpose of designing an
intervention program, the patients who are targeted
for earliest deployment to health management and
nurse intervention are those with the highest scores
(ie, the patients with the highest probability of
incurring costs ≥$2000).
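The scoring and ranking step can be sketched as a coefficient-weighted sum followed by a descending sort. The coefficients, variable names, and members below are hypothetical and chosen only for illustration; they are not the coefficients of the study model.

```python
# Hypothetical coefficients for illustration only.
COEFFICIENTS = {
    "intercept": 0.02,
    "age_decades": 0.02,
    "comorbid_count": 0.08,
    "drug_classes": 0.03,
    "physician_visits": 0.01,
    "has_diabetes": 0.10,
}

def score(member):
    """Intercept plus the coefficient-weighted sum of the member's variables.
    Variables missing from the member record are treated as 0."""
    total = COEFFICIENTS["intercept"]
    for name, coef in COEFFICIENTS.items():
        if name != "intercept":
            total += coef * member.get(name, 0)
    return total

# Hypothetical low-cost members with base-year characteristics.
members = {
    "M001": {"age_decades": 3, "comorbid_count": 0, "drug_classes": 1, "physician_visits": 2},
    "M002": {"age_decades": 6, "comorbid_count": 3, "drug_classes": 4,
             "physician_visits": 6, "has_diabetes": 1},
    "M003": {"age_decades": 5, "comorbid_count": 1, "drug_classes": 2, "physician_visits": 3},
}

# Highest score first: these members would be contacted earliest.
ranked = sorted(members, key=lambda m: score(members[m]), reverse=True)
print(ranked)
```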
The probability of identified members experienc-
ing costs ≥$2000 declines as the number of identi-
fied members increases. This inverse relationship,
or “yield curve,” suggests that a population risk
management program should focus on high-risk
patients (Figure 2).
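A yield curve of this kind can be computed from scored members and their observed outcomes by accumulating the hit rate from the top of the ranking downward. The scores and outcomes below are hypothetical; only the procedure is intended to mirror the text.

```python
def yield_curve(scored_outcomes):
    """scored_outcomes: list of (score, outcome) pairs, where outcome is 1 if
    the member's next-year cost was >= $2000 and 0 otherwise.
    Returns (fraction_targeted, cumulative_hit_rate) points, starting with the
    highest-scored member and adding one member at a time."""
    ordered = sorted(scored_outcomes, key=lambda pair: pair[0], reverse=True)
    curve = []
    hits = 0
    for k, (_, outcome) in enumerate(ordered, start=1):
        hits += outcome
        curve.append((k / len(ordered), hits / k))
    return curve

# Hypothetical scored members.
sample = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1),
          (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0)]
curve = yield_curve(sample)
for fraction, hit_rate in curve:
    print(f"top {fraction:.0%} targeted: {hit_rate:.1%} high cost")
```

The last point of the curve is always the overall rate for the whole low-cost population, which is the baseline against which the targeted groups are compared.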
Figure 1. Relationship Between the Number of Comorbid Conditions and the Probability of Annual Costs Being ≥$2000
[Figure not reproduced. Horizontal axis: Comorbidities, No. (0, 1, 2, 3, ≥4); vertical axis: Medical Expenses ≥$2000, % (0 to 50).]
*International Classification of Diseases, Ninth Revision, Clinical
Modification, code values of the comorbid conditions or diagnostic
categories are available from the authors. The clinical logic that
determines the presence or absence of a specific condition [based
on pharmacy utilization patterns, demographic characteristics, lab-
oratory test results, and physician visit patterns] is proprietary.
VALIDATING THE TARGETING MODEL
Retrospective Validation: 1999 Claims Expenses
The prediction model was applied to approxi-
mately 209 000 members who were enrolled in 1998
and had claims of <$2000. The members were
ranked from high to low according to the probability
of incurring medical expenses ≥$2000 in 1999.
Results are given in Table 2 and illustrate more
clearly the relationship exhibited in Figure 2.
The inverse relationship between these 2 vari-
ables indicates that the higher the percentage of low-
cost 1998 enrollees targeted, the lower the
proportion of those targeted patients incurring med-
ical expenses ≥$2000 in 1999. For example, the
highest ranked 1054 low-cost 1998 patients, who
were 0.5% of the 1998 low-cost enrolled population,
had a 51.0% probability of incurring high costs,
approximately 3.6 times the average of the entire
low-cost population. Table 2 also gives (by rank) the
average claims costs incurred by targeted members
who had claims ≥$2000.
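The figures quoted from Table 2 can be verified with a short calculation. All inputs are taken from the rank 40 and rank 0 rows of Table 2; the variable names are illustrative.

```python
top_rate = 51.0       # % of the 1054 top-ranked patients with 1999 costs >= $2000 (rank 40)
overall_rate = 14.2   # % of all 209 069 low-cost patients with 1999 costs >= $2000 (rank 0)

# Ratio of the top group's rate to the overall low-cost rate.
lift = top_rate / overall_rate

# Implied number of high-cost patients in the top group, computed two ways.
count_from_rate = 1054 * top_rate / 100   # patients times probability
count_from_costs = 3899475 / 7248         # reported costs divided by average cost per patient

print(f"lift = {lift:.2f}")
print(f"implied high-cost patients: {count_from_rate:.1f} vs {count_from_costs:.1f}")
```

The two counts agree to within one patient, which is consistent with the statement that the average cost in Table 2 is taken over targeted members who had claims ≥$2000.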
Prospective Validation: 2000 Claims Expenses
For the prospective test, the prediction model was
applied to low-cost 1999 patients. Members in 1999
and 2000 were ranked from high to low according to
the estimated probability of incurring costs ≥$2000
in 2000. Because this prediction was done as part of
an intervention program, the highest-ranking high-
risk patients were selected to receive nurse inter-
ventions. This group of 5535 members had a risk
ranking similar to that of the group we reported pre-
viously herein (risk rank ≥34). Eighty percent of
these targeted patients (n=4428) were randomly
selected to receive an intervention and, thus, were
inappropriate for prospective validation because
their behavior was subject to change by the inter-
vention. The remaining 20% of the high-risk patients
(n=1107) composed the control group and received
no intervention.
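The 80/20 random assignment described above can be sketched as a seeded shuffle. The member identifiers, seed, and function name are hypothetical; the article does not describe the randomization mechanism itself.

```python
import random

def assign_groups(member_ids, intervention_share=0.8, seed=0):
    """Shuffle the targeted members and assign the first `intervention_share`
    of them to the intervention group and the rest to the control group."""
    ids = list(member_ids)
    random.Random(seed).shuffle(ids)
    n_intervention = round(len(ids) * intervention_share)
    return ids[:n_intervention], ids[n_intervention:]

# 5535 targeted members, identified here by hypothetical integer IDs.
intervention, control = assign_groups(range(5535), intervention_share=0.8, seed=2000)
print(len(intervention), len(control))
```

With 5535 targeted members this reproduces the group sizes reported in the text (4428 and 1107).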
The subpopulation identified by the prediction
model as high risk, on average, was older, was more
likely to be male, and had more comorbid condi-
tions than the nontargeted population (Table 3).
The prevalences of asthma, diabetes, and conges-
tive heart failure for the high-risk population were
also significantly higher than those for the low-risk
population.
Table 4 displays results for the 1107 members of
the control group and of the remaining, untargeted
171 071 members who experienced claims <$2000
in 1999 but who were not targeted for intervention
because their risk scores were low (risk rank ≤33).
Pharmaceutical and medical claims incurred in
2000 were used to calculate the relative frequency of
patients with paid claims ≥$2000 and to compare
the average cost of the “control” high-risk patients
with that of the low-risk patients.
Table 4 provides the claims expenses for 2000 for
the targeted and nontargeted patients. Of the target-
ed, highest-ranked patients, 39.7% incurred high
(≥$2000) claims expenses in 2000 compared with 12.2%
of the nontargeted patients.
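The size of this difference can be illustrated with a relative risk and a two-proportion z statistic. The group sizes and percentages are those reported in the text; the z statistic is an added illustration and is not a test reported in the article.

```python
import math

n_target, p_target = 1107, 0.397    # high-risk control group: share with 2000 costs >= $2000
n_other, p_other = 171071, 0.122    # nontargeted members: share with 2000 costs >= $2000

relative_risk = p_target / p_other

# Pooled two-proportion z statistic (illustrative, large-sample approximation).
pooled = (n_target * p_target + n_other * p_other) / (n_target + n_other)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_target + 1 / n_other))
z = (p_target - p_other) / std_err

print(f"relative risk = {relative_risk:.2f}")
print(f"z statistic = {z:.1f}")
```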
DISCUSSION
Although the prediction model was constructed
using data from one large HMO instead of pooled data
from other MCOs, the model can be generalized and
modified to fit other populations.
The key variables used in the regression model (patient age, sex, number of chronic conditions, number of distinct drug classes, number of physician visits, nonphysician/nonhospital medical utilization, and presence or absence of diabetes, cardiovascular, respiratory, and psychiatric diseases) have been tested on several other large HMO data sets and different periods. The coefficients differ somewhat because of differences between plan-specific factors such as plan type, physician incentives, copayment/deductible, and pharmacy benefit levels. However, the variables
used in the prediction model in the 209 000-mem-
ber study group are identical to those found in
HMOs with different patient and financial charac-
teristics. If 2 years’ data are available for a large
MCO, our preference is to construct MCO-specific
prediction models rather than to make adjustments
for the differences in plan characteristics.
Figure 2. Yield Curve Showing that the Probability of Identified Members Experiencing Costs ≥$2000 Declines as the Number of Identified Members Increases
[Figure not reproduced. Horizontal axis: Patients Targeted, % (0 to 100); vertical axis: Targeted Patients with Cost ≥$2000 in 1999, % (0 to 60).]
Our analyses shed new light on regression to the mean. We found that very few patients are consistently high-cost members (data not shown). Of those members who incurred catastrophic costs in 1999 (≥$25 000), 39% were in the 1998 low-cost category and 43% came from the previous year’s medium-cost segment. Only 18% of the 1998 high-cost category members were “repeat” high-cost consumers in 1999. In 2000, a very small percentage of the high-cost patients accounted for ~20% of total expenditures; our data thus suggest that expenditure levels in the base year are not a good predictor of high costs in the subsequent year.
The transient nature of patients in MCOs is well
known in the industry, with turnover rates of 20% to
25% per year. Our prediction models required 2
years’ data for model construction and validation.
Disenrollment of >30 000 patients occurred in 2000.
Additional work is perhaps needed to study the char-
acteristics and utilization patterns of patients who
enroll and disenroll. Patients who are identified as
high-risk patients in the first year but then disenroll
before an intervention can be undertaken and meas-
ured will continue to complicate the evaluation of
population risk management programs.
In our analytical approach, it was not practical to
make any adjustments to reflect possible increases
in provider reimbursement rates, which were mod-
est in 1998-2000. We doubt that price adjustments,
which would require considerable effort, would
change the basic results of our research. One alter-
native worth consideration for future modeling is to
study expenditure patterns in the base and subse-
quent years by quintile or quartile.
The small price increases may have caused a few patients to shift from low cost (<$2000) to high cost (≥$2000). Using different cutoff values ($2500, $3000, etc) did not materially affect our results.

Table 2. Patients With Claims in Both 1998 and 1999 and 1998 Costs <$2000

| Rank | Cumulative Patients, No. | Cumulative Probability of Claims ≥$2000 in 1999, % | Cumulative Patients, % | Costs in 1999, $ | Average 1999 Cost per Patient, $ |
|---|---|---|---|---|---|
| 40 | 1054 | 51.0 | 0.5 | 3 899 475 | 7248 |
| 38 | 1713 | 50.2 | 0.8 | 5 963 129 | 6934 |
| 36 | 2668 | 48.7 | 1.3 | 9 041 030 | 6965 |
| 34 | 4019 | 47.8 | 1.9 | 13 106 399 | 6823 |
| 32 | 5741 | 45.6 | 2.7 | 17 510 712 | 6694 |
| 30 | 7952 | 43.7 | 3.8 | 23 255 465 | 6686 |
| 28 | 10 821 | 41.5 | 5.2 | 29 956 931 | 6678 |
| 26 | 14 487 | 39.0 | 6.9 | 37 678 296 | 6669 |
| 24 | 18 899 | 37.0 | 9.0 | 45 867 938 | 6558 |
| 22 | 24 125 | 35.3 | 11.5 | 55 632 741 | 6539 |
| 20 | 30 194 | 33.1 | 14.4 | 65 010 923 | 6507 |
| 18 | 37 474 | 31.2 | 17.9 | 75 654 386 | 6476 |
| 16 | 46 223 | 29.1 | 22.1 | 87 219 404 | 6474 |
| 14 | 56 369 | 27.2 | 27.0 | 98 565 035 | 6426 |
| 12 | 67 260 | 25.5 | 32.2 | 110 030 854 | 6405 |
| 10 | 79 390 | 24.0 | 38.0 | 121 234 656 | 6371 |
| 8 | 94 913 | 22.0 | 45.4 | 132 390 434 | 6328 |
| 6 | 114 680 | 20.1 | 54.9 | 144 371 078 | 6276 |
| 4 | 139 547 | 18.0 | 66.7 | 156 633 721 | 6247 |
| 2 | 172 071 | 15.9 | 82.3 | 169 449 327 | 6184 |
| 0 | 209 069 | 14.2 | 100.0 | 183 062 265 | 6168 |

Table 3. Characteristics of High-Risk Patients With 1999 Costs <$2000

| Group | Male, % | Female, % | Average Age, y | Average Comorbid Diseases, No. | Diabetes Prevalence, % | Congestive Heart Failure Prevalence, % |
|---|---|---|---|---|---|---|
| High risk, targeted, no intervention (n = 1107) | 58.9 | 41.1 | 58.7 | 2.9 | 59.3 | 2.0 |
| All members | 44.5 | 55.5 | 36.4 | 0.5 | 3.4 | 0.2 |

CONCLUSIONS
The prediction model is based on historical claims
data and was used to score each member’s risk for
incurring high medical expenses in the second year.
The prediction model successfully identified patients
with low medical expenses in 1998 who were 3.6
times as likely to incur high medical expenses in
1999 as the overall low-cost population. The predic-
tion model was tested prospectively on 1107 patients
who received no intervention and were identified as
likely to incur high medical expenses in 2000. The
1107 patients were more likely to incur costs ≥$2000
(39.7% vs 12.2% for the nontargeted group). Their
average costs were $6602 vs $1108.
The prediction model is only the first step in
developing cost-effective intervention programs.
Much hard work remains:
• Creating, evaluating, and implementing new
interventions;
• Adopting objective criteria to evaluate interven-
tions, which may eventually involve measuring
clinical outcomes and cost savings;
• Testing and adopting enhancements to the model,
including increasing the horizon beyond a single
year to identify members at risk for longer-term
events;
• Forecasting the financial impact of interventions
by including cost estimates of interventions and
estimated impact on medical expenses;
• Collecting data prospectively to assess the cost-
effectiveness of new and existing interventions;
• Providing tools for making better resource alloca-
tion, staffing, and intervention decisions;
• Finding ways to better identify and engage mem-
bers who are not compliant but whose behavior
may be changed by an intervention; and
• Developing tools to incorporate predictions into
pricing and underwriting strategies.
Population risk management depends on the
development of accurate prediction models in which
patients are selected for intervention according to
their predicted risk. The second component of
population risk management is to devise new
interventions, that is, programs that change healthcare
delivery and, ideally, improve patient outcomes.
A third and final step, in the absence of randomiza-
tion, is to use the prediction model to adjust
patients’ outcomes so that actual-to-expected out-
comes can be compared.
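
One way to picture the actual-to-expected comparison is sketched below: the expected number of high-cost members in an intervention group is taken as the sum of their predicted probabilities, and the observed count is divided by it. This assumes the model scores have been calibrated to probabilities, which the article does not describe; the `actual_to_expected` helper and the sample values are hypothetical.

```python
def actual_to_expected(predicted_probs, outcomes):
    """Ratio of observed high-cost members to the number the model expects.

    predicted_probs: assumed calibrated probability of high cost per member.
    outcomes: 1 if the member actually became high cost, else 0.
    """
    if len(predicted_probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    expected = sum(predicted_probs)
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return sum(outcomes) / expected


# Hypothetical intervention group of five members (not study data).
probs = [0.5, 0.4, 0.4, 0.3, 0.4]  # expected high-cost members: about 2
observed = [1, 0, 0, 0, 0]         # actual high-cost members: 1
ratio = actual_to_expected(probs, observed)
print(ratio)
```

A ratio below 1 would indicate fewer high-cost members than the model predicted, although without randomization it cannot by itself show that the intervention caused the difference.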
388 THE AMERICAN JOURNAL OF MANAGED CARE MAY 2003
Table 4. Year 2000 Outcomes of Patients With 1999 Cost <$2000

| Group and Measure | 1999 | 2000, All Members | 2000, Members With Claims ≥$2000 |
|---|---|---|---|
| High risk, targeted with no intervention (n = 1107) | | | |
| Average cost, $ | 1278 | 3176 | 6602 |
| Costs ≥$2000, % | 0 | 39.7 | 100.0 |
| Low risk, not targeted (n = 171 107) | | | |
| Average cost, $ | 432 | 1108 | 6033 |
| Costs ≥$2000, % | 0 | 12.2 | 100.0 |

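
The Table 4 figures can be turned into simple ratios to compare the targeted and nontargeted groups. The sketch below does only that arithmetic; the dictionary layout and variable names are illustrative and are not part of the study.

```python
# Year-2000 figures from Table 4 (members whose 1999 costs were below $2000).
targeted = {"n": 1107, "pct_high_cost": 39.7, "avg_cost": 3176}
not_targeted = {"n": 171107, "pct_high_cost": 12.2, "avg_cost": 1108}

# Ratio of the shares of members reaching costs of $2000 or more in 2000.
relative_likelihood = targeted["pct_high_cost"] / not_targeted["pct_high_cost"]

# Ratio of average 2000 costs across all members in each group.
cost_ratio = targeted["avg_cost"] / not_targeted["avg_cost"]

# Approximate head count of targeted members who became high cost.
targeted_high_cost = round(targeted["n"] * targeted["pct_high_cost"] / 100)

print(f"relative likelihood: {relative_likelihood:.2f}")
print(f"cost ratio: {cost_ratio:.2f}")
print(f"targeted members with costs >= $2000: about {targeted_high_cost}")
```

On these figures the targeted group was roughly 3.3 times as likely to reach $2000 in costs, and its average cost was roughly 2.9 times that of the nontargeted group.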
VOL. 9, NO. 5 THE AMERICAN JOURNAL OF MANAGED CARE 389
Appendix

1. Specifics of the Prediction Model

All variables are transformed to be used in the model. Because this is a relative risk stratification, rather than an absolute determination of cost, there is no need for an intercept variable.

| Independent Variable | Coefficient |
|---|---|
| Diabetes (drug and diagnosis based) | 0.069 |
| Cardiac diagnosis (drug and diagnosis based) | 0.039 |
| Respiratory disease (drug and diagnosis based) | 0.0039 |
| Psychiatric diagnosis (drug and diagnosis based) | 0.025 |
| Physician visit variable | 0.0029 |
| Nonhospital, non–emergency department, nonphysician medical claim variable | 0.012 |
| Composite prescription drug variable: measure of prescription drug classes | 0.013 |
| Comorbidities (truncated at 4) | 0.0076 |

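
As a rough illustration of how the coefficients above could be combined into a relative risk score, the sketch below computes an intercept-free weighted sum and ranks members by it. The variable transformations are not specified in this appendix, so the inputs are assumed to be already transformed, and reading "truncated at 4" as a cap on the comorbidity count is also an assumption. The short key names, the `risk_score` helper, and the two sample members are hypothetical.

```python
# Coefficients copied from the appendix table; key names are hypothetical shorthand.
COEFFICIENTS = {
    "diabetes": 0.069,
    "cardiac": 0.039,
    "respiratory": 0.0039,
    "psychiatric": 0.025,
    "physician_visits": 0.0029,
    "other_medical_claims": 0.012,
    "drug_classes": 0.013,
    "comorbidities": 0.0076,
}

MAX_COMORBIDITIES = 4  # assumed meaning of "truncated at 4"


def risk_score(member):
    """Intercept-free weighted sum of already-transformed predictor values."""
    values = dict(member)
    values["comorbidities"] = min(values.get("comorbidities", 0), MAX_COMORBIDITIES)
    return sum(coef * values.get(name, 0.0) for name, coef in COEFFICIENTS.items())


# Hypothetical members; values are illustrative, not taken from the study.
members = {
    "A": {"diabetes": 1, "cardiac": 1, "physician_visits": 6,
          "drug_classes": 5, "comorbidities": 6},
    "B": {"respiratory": 1, "physician_visits": 2,
          "drug_classes": 1, "comorbidities": 0},
}

scores = {member_id: risk_score(m) for member_id, m in members.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # highest risk first
print(ranked)
```

Because the score is used only to rank members, its absolute value carries no meaning on its own, which is consistent with the model omitting an intercept.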
2. Models were evaluated using receiver operating
characteristic (ROC) curves. The ROC curve
shown below for 1998-1999 continuous enrollment
has an area of 0.73.
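
For readers unfamiliar with the area under an ROC curve, the sketch below computes it with the rank (Mann-Whitney) formulation on toy data. It is not the authors' evaluation code; the `roc_auc` helper, scores, and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen high-cost member receives a
    higher score than a randomly chosen low-cost member; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one member in each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and outcomes (1 = cost of $2000 or more the following year).
toy_scores = [0.22, 0.05, 0.14, 0.02, 0.09, 0.14]
toy_labels = [1, 0, 1, 0, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(auc)
```

An area of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, so the reported 0.73 lies between the two.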
[Figure: ROC curve for 1998-1999 continuous enrollment. x-axis: 1-Specificity, % (0 to 100); y-axis: Sensitivity, % (0 to 100).]

3. Sample Demographic Statistics

Eligibility, 1998 Cost <$2000

| Age | Female | Male | All Gender |
|---|---|---|---|
| Unknown | 10 968 | 10 622 | 21 590 |
| <15 | 31 942 | 33 690 | 65 632 |
| 15-24 | 19 224 | 14 659 | 33 883 |
| 25-34 | 23 635 | 15 144 | 38 779 |
| 35-44 | 25 902 | 18 113 | 44 015 |
| 45-54 | 21 729 | 15 952 | 37 681 |
| 55-64 | 13 633 | 10 565 | 24 198 |
| 65-74 | 13 016 | 10 709 | 23 725 |
| 75-84 | 6173 | 4337 | 10 510 |
| 85+ | 1709 | 808 | 2517 |
| All Ages | 167 931 | 134 599 | 302 530 |
| Sex unknown | | | 1647 |
| All | | | 304 177 |

Disease Classes: 1998 Cost <$2000

| Measure | Claimants | Congestive Heart Failure | Cardiac Condition | Asthma | Diabetes | Respiratory Condition |
|---|---|---|---|---|---|---|
| No. | 304 177 | 756 | 29 812 | 27 910 | 10 412 | 68 579 |
| % | 100.0 | 0.2 | 9.8 | 9.2 | 3.4 | 22.5 |

4. Cost Distribution (1998)

All 1998

| Cost Group, $ | Claimants | Cumulative Claimants | % Claimants | Cumulative % Claimants |
|---|---|---|---|---|
| 0-999 | 269 489 | 269 489 | 77.3 | 77.3 |
| 1000-1999 | 34 688 | 304 177 | 9.9 | 87.2 |
| 2000-2999 | 14 450 | 318 627 | 4.1 | 91.4 |
| 3000-3999 | 7936 | 326 563 | 2.3 | 93.7 |
| 4000-4999 | 4931 | 331 494 | 1.4 | 95.1 |
| 5000-9999 | 9639 | 341 133 | 2.8 | 97.8 |
| 10 000-19 999 | 4634 | 345 767 | 1.3 | 99.2 |
| 20 000-24 999 | 904 | 346 671 | 0.3 | 99.4 |
| 25 000+ | 2009 | 348 680 | 0.6 | 100.0 |
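
The cumulative columns of the 1998 cost distribution can be reproduced from the per-band claimant counts, as in the sketch below. Only the counts are taken from the appendix; the variable names and print layout are illustrative.

```python
# Claimant counts per 1998 cost band, copied from the cost distribution table.
cost_bands = [
    ("0-999", 269489),
    ("1000-1999", 34688),
    ("2000-2999", 14450),
    ("3000-3999", 7936),
    ("4000-4999", 4931),
    ("5000-9999", 9639),
    ("10 000-19 999", 4634),
    ("20 000-24 999", 904),
    ("25 000+", 2009),
]

total = sum(count for _, count in cost_bands)

# Each row: band, count, cumulative count, % of total, cumulative % of total.
rows = []
running = 0
for band, count in cost_bands:
    running += count
    rows.append((band, count, running, 100 * count / total, 100 * running / total))

for band, count, cumulative, pct, cumulative_pct in rows:
    print(f"{band:>15} {count:>8} {cumulative:>8} {pct:5.1f} {cumulative_pct:6.1f}")
```

The second cumulative figure, 304 177, matches the number of members with 1998 costs below $2000 reported in the eligibility and disease-class tables.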