Development of Peer-Group-Classification Criteria for the Comparison of Cost Efficiency among General Hospitals under the Korean NHI Program
Hee-Chung Kang, Jae-Seok Hong, and Heon-Jin Park
Objectives. To classify general hospitals into homogeneous systematic-risk groups in order to compare cost efficiency and to propose peer-group-classification criteria.
Data Sources. Health care institution registration data and inpatient-episode-based claims data submitted by the Korea National Health Insurance system to the Health Insurance Review and Assessment Service from July 2007 to December 2009.
Study Design. Cluster analysis was performed to classify general hospitals into peer groups based on similarities in hospital characteristics, case mix complexity, and service-distribution characteristics. Classification criteria reflecting the clustering were developed. To test whether the new peer groups better adjusted for differences in systematic risks among peer groups, we compared the R² statistics of the current and proposed peer groups according to total variations in medical costs per episode and case mix indices influencing cost efficiency.
Data Collection. A total of 1,236,471 inpatient episodes were constructed for 222 general hospitals in 2008.
Principal Findings. New criteria were developed to classify general hospitals into three peer groups (large general hospitals, small and medium general hospitals treating severe cases, and small and medium general hospitals) according to size and case mix complexity.
Conclusions. This study provides information about using peer grouping to enhance
fairness in the performance assessment of health care providers.
Key Words. Cost efficiency, peer group, cluster analysis
Policy makers in many health insurance systems experiencing rapidly escalating health care expenditures have focused on developing tools to hold physicians accountable for their decisions (Bindman 1999; Austin et al. 2004).
Several strategies have been developed, including utilization review and
practice guidelines, to induce physicians to choose more rational, less expen-
sive behavior (Bindman 1999). Many insurance programs have employed uti-
lization review to foster more appropriate payment, but case-by-case
utilization review is time consuming, and a more efficient methodology is
required to complete the review of a large number of claims within the legally mandated time frame. In this context, profiling is considered to constitute a
more efficient mechanism for observing a physician's pattern of care (Bindman 1999; Romano 2004). Performance profiling involves comparing an individual physician's performance during a fixed period with normative standards or with the average performance level of comparable physicians during the same period (Bindman 1999; Smith 2000; Weiss and Wagner 2000; Austin et al. 2004). However, such a performance comparison between hospitals invites dispute due to the difficulties of defining comparable hospitals, particularly comparable general hospitals, which are multipurpose, multiproduct institutions providing ambulatory care in addition to inpatient services (Stefos, Lavallee, and Holden 1992; Sandy 1999; Pink et al. 2009). Contextual as well as patient characteristics must be considered when classifying peer groups of general hospitals to enable adjustment for the many systematic risks that are not easily manageable by hospitals themselves (Austin et al. 2004; Byrne et al. 2009).
The objective of this study was to classify general hospitals into peer groups so that differences in systematic risks could be adjusted and to propose peer-group-classification criteria for assigning newly established general hospitals, or those with changing characteristics, to appropriate peer groups.
The Ministry of Health and Welfare (MOHW) of Korea oversees the
National Health Insurance (NHI) Program. In terms of implementation,
the National Health Insurance Corporation (NHIC) functions as the insurer,
and the Health Insurance Review and Assessment Service (HIRA) conducts
reviews and assessments of medical costs. The HIRA review process is
designed to minimize the risk of payment for excessive or unnecessary patient
care in a fee-for-service-based reimbursement system (HIRA 2010).
Address correspondence to Hee-Chung Kang, Ph.D., Review & Assessment Research Division,
Health Insurance Review and Assessment Service, 1451-34, Peace BLDG 11F, Seocho3-Dong,
Seocho-Gu, Seoul, 137-927, Korea; e-mail: email@example.com. Jae-Seok Hong, Ph.D., is with
the Review & Assessment Research Division, Health Insurance Review and Assessment Service,
1720 HSR: Health Services Research 47:4 (August 2012)
Since 1989, benefits under Korea’s NHI program have been distributed
in two steps. The first step involves primary care institutions (clinics, hospi-
tals, general hospitals), and the second step involves the 44 tertiary general
hospitals (NHIC 2011). When an insured individual (dependent) requests
medical care at a tertiary general hospital, he or she must present a document
issued by the referring physician. Health care institutions are classified into
three groups by the number of beds they contain; clinics have fewer than 30
beds, hospitals have 30–99 beds, and general hospitals have more than 99
beds. The MOHW designates tertiary hospitals from general hospital appli-
cants on the basis of whether they meet the standards for teaching hospitals
and fulfill other criteria (NHIC 2007). Patients tend to be drawn to general
hospitals because large institutions generate greater trust. This phenomenon
has been identified as one of the contributors to increased medical costs (Lee
and Park 2010).
The HIRA has operated the Comprehensive Management for Appropriate Medical Services System (CM System), overseeing clinics, since 2003. The CM System is designed to encourage medical clinics showing extreme patterns of practice and patient care to change voluntarily through feedback provided by economic-performance profiles.
As medical costs continued to increase, the HIRA extended the implementation of the CM System to cover hospitals and, in July 2009, to cover general hospitals and tertiary general hospitals. The cost efficiency of a general hospital is measured relative to the average cost of an appropriate group of peer hospitals, currently defined only by the number of beds: hospitals with ≥301 beds and those with ≤300 beds. Administrators of general hospitals have argued that a more refined classification of peer groups is required for fairer comparisons among general hospitals.
Study Data and Measures
Data were obtained from the NHI Claims Database and the NHI Institution Registration Data submitted to the HIRA by health care institutions. A total of 251 general hospitals, excluding tertiary general hospitals, provided health care under the NHI Program for the 12 months from January to December 2008. Their inpatient claims submitted to the HIRA from July 2007 to December 2009 were arranged according to admission date (from July 2007 to June 2009). Records were treated as belonging to the same episode if the time between the discharge date and the following admission date was 2 days or less and the second admission was associated with the same diagnosis-related group (DRG). Inpatient episodes that began in 2008 were selected from the episode-based data for the analysis. Inpatient episodes are referred to as patients or episodes for simplicity. Additional costs were adjusted according to the standard cost of the NHI fee schedule. Outlier episodes by DRG were excluded, taking into account the effects of small sample sizes, resulting in the exclusion of DRGs with fewer than 10 inpatient episodes per general hospital (Pope and Kautter 2007).
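The episode-construction rule described above (a readmission within 2 days under the same DRG joins the prior stay) can be sketched in Python. This is an illustrative reconstruction, not the authors' code; the record field names (`admit`, `discharge`, `drg`, `cost`) are hypothetical:

```python
from datetime import date

def build_episodes(claims):
    """Link one patient's inpatient claim records into episodes.

    Records belong to the same episode when the gap between a
    discharge and the next admission is 2 days or less AND both
    records carry the same DRG.
    """
    episodes = []
    for rec in sorted(claims, key=lambda r: r["admit"]):
        if (episodes
                and rec["drg"] == episodes[-1]["drg"]
                and (rec["admit"] - episodes[-1]["discharge"]).days <= 2):
            # Same episode: extend the stay and accumulate the cost.
            last = episodes[-1]
            last["discharge"] = max(last["discharge"], rec["discharge"])
            last["cost"] += rec["cost"]
        else:
            episodes.append(dict(rec))
    return episodes
```

Claims are first sorted by admission date, so a merged readmission is costed as a single episode.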
The ratio of main-disease episodes to total number of episodes exceeded
the criterion set by the MOHW for specialized hospitals in 29 general hospi-
tals. As of 2010, hospitals could be designated as specialized in nine domains
(joints, the cerebrovascular system, the colon, hands and feet, the cardiac sys-
tem, alcohol, breasts, spinal issues, and burns). To qualify as specializing in
spinal or alcohol-related diseases, a hospital must show that more than 66 per-
cent of its cases involve one of these areas; the comparable requirement for
the other domains is greater than 45 percent. These hospitals were excluded
from the analysis as they were regarded as specialized settings for certain dis-
eases, and their practice behavior differed from that of general hospitals. The
final sample included 222 general hospitals and 1,236,471 inpatient episodes.
The unit of analysis was a general hospital.
Peer groups, used to control for differences in systematic risks among
hospitals, were defined on the basis of hospital characteristics, case mix com-
plexity, and service-distribution characteristics, which have been studied as
influences on the finances or clinical outcomes of hospitals (Ellis and McGuire
1988; Stefos, Lavallee, and Holden 1992; Byrne et al. 2009). The use of peer
groups enabled us to categorize hospitals according to structural and patient
characteristics to facilitate comparisons among similar institutions (Byrne et al. 2009).
Hospital characteristics consisted of type of ownership, teaching status,
location, number of specialties, number of beds, number of medical staff,
types of equipment, patient volume, and medical costs per inpatient episode.
Outpatient episodes were added to the inpatient episodes to calculate patient
volume. These variables have previously been used as hospital characteristics
in analyses of sources of variations among hospitals in case mix complexity
and performance indices (Ament, Kobrinski, and Wood 1981; Becker and
Steinwald 1981; Rosko and Carpenter 1993; Jian et al. 2009).
Case mix complexity can be measured using scalar indices or informa-
tion-theory indices (Park and Shin 2004). The former approach reduces the
vector of the proportion of patients classified into diagnostic categories into a
single-value index through multiplication by vector weights, which often
include the length of stay and costs or charges. The latter approach, first pro-
posed by Evans and Walker, measures differences in the proportions of two
sets of patients classified according to diagnostic categories (Klastorin and
Watts 1980; Park and Shin 2004). This study used both the Case Mix Index (CMI) as a scalar index and the Professional Care Disease Index (PCDI) as an information-theory index.
The HIRA assigns an inpatient episode to a disease group using the Kor-
ean Diagnostic Related Groupings (KDRG) version 3.3, an inpatient case mix
classification system. The CMI is calculated with the following equation:

$$\mathrm{CMI}_h = \frac{\sum_i N_{hi} C_i \,\big/\, \sum_i N_{hi}}{\sum_i N_i C_i \,\big/\, \sum_i N_i},$$

where h: a general hospital in the country; i: DRG index; Nhi: number of inpatient episodes in the ith DRG of general hospital h; Ni: number of inpatient episodes in the ith DRG in all general hospitals; Ci: average medical costs per inpatient episode in the ith DRG in all general hospitals.

As the same average cost per DRG (Ci) is applied to the denominator (Ni) and the numerator (Nhi), the index value would be 1 if the case mix of a general hospital were equal to the average for its peer group. The DRG classification was developed so that resource consumption, as well as the clinical characteristics of the patients in a DRG, would be similar. Therefore, if the CMI of a general hospital were 1.2, the general hospital would be regarded as having a case mix spending cost that was 20 percent higher than that of the peer-group average.
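As an illustrative sketch of the CMI formula (not the study's SAS implementation; the dictionary layout and DRG labels are our assumptions):

```python
def case_mix_index(hosp_counts, all_counts, avg_cost):
    """CMI of one hospital against the all-hospital average.

    hosp_counts[d]: inpatient episodes in DRG d at this hospital (Nhi)
    all_counts[d]:  inpatient episodes in DRG d at all hospitals  (Ni)
    avg_cost[d]:    average cost per episode in DRG d             (Ci)
    """
    hosp_avg = (sum(n * avg_cost[d] for d, n in hosp_counts.items())
                / sum(hosp_counts.values()))
    natl_avg = (sum(n * avg_cost[d] for d, n in all_counts.items())
                / sum(all_counts.values()))
    return hosp_avg / natl_avg
```

A hospital whose case mix matches the national mix gets a CMI of exactly 1; treating disproportionately costly DRGs pushes the index above 1.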
The PCDI is one of the accreditation criteria used by the MOHW to
designate tertiary general hospitals in the NHI program. Tertiary hospitals are
expected to treat severely ill patients who cannot be adequately diagnosed or
treated at the first stage of the NHI health care delivery system (National
Health Insurance Corporation (NHIC) 2011; Park and Shin 2004). The PCDI is the proportion of a general hospital's patients in the adjacent DRGs (ADRGs) that need to be treated in tertiary hospitals, standardized by the proportion of these cases at the national level (Park and Shin 2004). The ADRG is the first four digits of the DRG code.
In this study, we used this index as a second indicator to assess the case mix
complexity of the study institutions (general hospitals not designated as
tertiary institutions). Whereas the CMI represents the number of high-cost
patients within the total population of patients, the PCDI indicates the propor-
tion of all patients that pose a clinically high risk. Therefore, when the two indices are considered together, the case mix complexity per general hospital can be assessed more accurately.
$$\mathrm{PCDI}_i = \frac{\sum_{j \in K} n_{ij} \,/\, N_i}{\sum_{i \in H}\sum_{j \in K} n_{ij} \,\big/\, \sum_{i \in H} N_i},$$

where H: set of all general hospitals in the nation; K: set of ADRGs that need to be treated in tertiary hospitals; i: subscript for H; j: subscript for K; nij: number of inpatient episodes in the jth ADRG of the ith general hospital; Ni: total number of inpatient episodes in the ith general hospital.
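The PCDI can be sketched the same way; this is an illustration with hypothetical data structures, not the study's implementation:

```python
def pcdi(hospital_cases, severe_adrgs):
    """PCDI per hospital: the share of episodes in ADRGs expected to
    be treated at tertiary hospitals (set K), standardized by the
    national share of such episodes.

    hospital_cases: {hospital: {adrg: n_episodes}}
    """
    natl_severe = sum(n for cases in hospital_cases.values()
                      for a, n in cases.items() if a in severe_adrgs)
    natl_total = sum(n for cases in hospital_cases.values()
                     for n in cases.values())
    natl_share = natl_severe / natl_total
    return {h: (sum(n for a, n in cases.items() if a in severe_adrgs)
                / sum(cases.values())) / natl_share
            for h, cases in hospital_cases.items()}
```

A value above 1 means the hospital treats a higher-than-national share of tertiary-level cases.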
Third, the service-distribution characteristics included the specialization
status and the distribution of patients by disease type. The specialization status
was measured using the Internal Herfindahl Index, a measure of the concen-
tration of services in a health care institution that is derived from the Herfin-
dahl-Hirschman Index and used to measure the diversity of procedures
performed at a single hospital or within a region (Wachtel et al. 2010). The
Internal Herfindahl Index (IHI) of institution i is calculated as the sum of the squared Pj values:

$$\mathrm{IHI}_i = \sum_j P_j^2,$$

where Pj: proportion of all the patients in general hospital i accounted for by the jth ADRG category.
If few services are provided in a hospital, the concentration of patients in those services will be high (Lee and Chun 2008). In this study, the specialization status of the service was interpreted to be higher as the scope of inpatient episodes treated by the general hospital became narrower.
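A minimal sketch of the IHI computation (the input structure is a hypothetical mapping of ADRG category to patient count):

```python
def internal_herfindahl(adrg_counts):
    """IHI_i = sum of squared patient shares P_j across ADRG categories.

    Values near 1 mean patients are concentrated in few ADRGs
    (narrow service scope); values near 0 mean a broad scope.
    """
    total = sum(adrg_counts.values())
    return sum((n / total) ** 2 for n in adrg_counts.values())
```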
The proportion of the total number of inpatient episodes attributable to each disease was used as an indicator of the distribution of patients by disease type. This is one of the criteria used by the MOHW to designate specialized hospitals.
Figure 1 presents the five steps involved in the analyses performed in this study. The first step was to compare the distributions of the current peer groups (bed size: ≤300, ≥301) by their characteristics (t-test, general linear model [GLM]). The second step was to classify general hospitals into peer groups using cluster analysis. Cluster analysis is a multivariate procedure that simultaneously considers all classification variables to arrange a sample of entities
into distinct groups according to shared characteristics (Stefos, Lavallee, and Holden 1992). For the cluster analysis, all variables were standardized using PROC STANDARD, and nonhierarchical cluster analysis was conducted for the appropriate number of clusters, as determined by hierarchical analysis. PROC CLUSTER with Ward's method was used for the hierarchical analysis, and PROC FASTCLUS with the K-means method was used for the nonhierarchical analysis. Cluster analysis was performed for all variables with the exception of categorical variables such as ownership, teaching status, and location. Variables that were highly related according to correlation analyses were converted into aggregated variables for performing cluster analysis. These included the number of medical staff (number of physician specialists + number of nurses + number of medical technicians + number of specialties); the number of types of equipment (number of types of diagnostic test equipment + number of types of radiological diagnostic and therapeutic equipment + number of types of physical therapy equipment + number of types of surgical and treatment-related equipment); and patient volume (number of inpatient episodes + number of outpatient episodes). Four cluster analyses were performed to classify hospitals into three peer groups by successively removing the variable that most weakly affected the previous clustering. The first clustering was performed for all the variables, including the distribution of patients by disease type. The second clustering was performed for the first clustering variables except the distribution of patients by disease type, which had most weakly affected the first cluster analysis. For the same reason, the PCDI in the third clustering and the IHI in the fourth clustering were additionally removed.

Figure 1: Study Hospitals and Analytical Process
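The study's clustering was done in SAS (PROC STANDARD for standardization, PROC CLUSTER with Ward's method to choose the number of clusters, PROC FASTCLUS for K-means). As a rough, self-contained sketch of the standardization and K-means steps only (not the authors' code; the Ward stage is omitted):

```python
import random

def standardize(rows):
    # Z-score each column, as PROC STANDARD (mean 0, std 1) would.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [max((sum((x - m) ** 2 for x in c) / len(c)) ** 0.5, 1e-12)
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def kmeans_labels(rows, k, iters=50, seed=0):
    # Plain K-means (cf. PROC FASTCLUS); k is assumed to have been
    # chosen beforehand via Ward's hierarchical clustering.
    rng = random.Random(seed)
    centers = rng.sample(rows, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            dists = [sum((a - b) ** 2 for a, b in zip(r, c)) for c in centers]
            groups[dists.index(min(dists))].append(r)
        centers = [[sum(col) / len(col) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return [min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(r, centers[j])))
            for r in rows]
```

Each hospital row would hold the aggregated variables (staff, equipment, patient volume, CMI, PCDI, IHI) described above.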
The third step was to characterize each peer group by identifying the common characteristics of the major classification factors shown in the iterative decision-tree analysis for the four cluster analyses.
The fourth step was to produce classification criteria through repeating the cluster and decision-tree analyses with only the variables related to the main factors identified in the prior step. To confirm that the classification criteria reflected the cluster analyses, the consistency between the final peer grouping and each previous clustering was assessed with weighted kappa statistics.
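The weighted kappa between two groupings can be sketched as follows (linear weights; an illustrative stdlib implementation, not the procedure actually used in the study):

```python
def weighted_kappa(a, b, categories):
    """Linear-weighted kappa between two categorical assignments
    (e.g., peer-group labels from two clustering runs) for the
    same set of hospitals."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(a)
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[idx[x]][idx[y]] += 1 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Linear agreement weights: full credit on the diagonal,
    # partial credit for near-miss categories.
    w = [[1 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    po = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    pe = sum(w[i][j] * row[i] * col[j] for i in range(k) for j in range(k))
    return (po - pe) / (1 - pe)
```

A value of 1 indicates perfect agreement between two clusterings; values near 0 indicate chance-level agreement.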
The final step was to compare the R² statistics for the total variation in medical costs per episode and the CMI, both of which influence cost efficiency, to examine whether the proposed peer-grouping criteria better adjusted for systematic risks when measuring cost efficiency than did the current peer-grouping criteria (Lee 2007). SAS 9.1 and SAS Enterprise Miner were used for this analysis.
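The R² statistic used here is the share of total variation explained by peer-group membership (between-group sum of squares over total sum of squares); a minimal sketch, not the authors' code:

```python
def between_group_r2(values, groups):
    """Share of total variation in `values` explained by group
    membership: between-group SS / total SS."""
    grand = sum(values) / len(values)
    total_ss = sum((v - grand) ** 2 for v in values)
    members = {}
    for v, g in zip(values, groups):
        members.setdefault(g, []).append(v)
    between_ss = sum(len(vs) * (sum(vs) / len(vs) - grand) ** 2
                     for vs in members.values())
    return between_ss / total_ss
```

A higher R² means less variation within peer groups and more between them, which is the property the comparison in the final step tests.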
Comparison of Characteristics between Current Peer Groups
When the current peer-group characteristics were compared, no significant
differences in ownership were evident (Table 1). A greater number of medical
schools were present in general hospitals with ≥301 beds. General hospitals with ≥301 beds were also characterized by larger medical staffs, more specialties, and a greater number of beds. The CMI was higher in the general hospitals with ≥301 beds than in those with ≤300 beds, and the PCDI was also higher in general hospitals with ≥301 beds.
Regarding service distribution, the specialization status was higher in general hospitals with ≤300 beds. With respect to patient proportions by disease type, no significant differences in general surgical and pediatric patients were evident, but lower proportions of joint, spinal, and colon patients were admitted to general hospitals with ≥301 beds.

Table 1: Descriptive Statistics of Study Hospitals (No. of Beds ≤ 300, N = 120 vs. No. of Beds ≥ 301, N = 102; values are mean ± SD; *p < .05, **p < .001).
The distributions of most of the characteristics assessed were wider in general hospitals with ≥301 beds than in general hospitals with ≤300 beds. The distributions of the two groups were not distinctive and only partially overlapped.
Clustering and Cluster Characterization
Three clusters emerged from cluster and decision-tree analyses. Cluster 1
included small and medium-sized general hospitals; cluster 2 included large
general hospitals; and cluster 3 included small and medium-sized general
hospitals that treated severe cases (Figure 2). Hospitals in cluster 1 had fewer specialists and specialties and lower proportions of pediatric patients or CMIs. Cluster 2 hospitals had more specialists and specialties and higher proportions of pediatric patients or CMIs. Hospitals in cluster 3 had fewer specialists and specialties but higher proportions of pediatric patients or CMIs.
The first classification variable in the decision tree for the first clus-
tering was the number of physician specialists, and the second classifica-
tion variable, which divided the group with fewer than 41.5 physician
specialists into two subgroups, was the proportion of pediatric patients;
the second classification variable in the second, third, and fourth decision
trees was the CMI.
The CMI of general hospitals, rather than the proportion of patients with each disease, was used to further classify hospitals because the CMI includes a measure of the complexity of the case mix of all patients.
When we compared the classification consistency using weighted kappa statistics, the level of consistency of the first clustering (0.3–0.4), which included the proportions of patients by disease type, was low with each subsequent clustering. The second, third, and fourth clusterings showed high levels of consistency with one another (0.7–0.8), with the exception of the analysis of the proportion of patients by disease.
Development of the Classification Criteria Used to Define Peer Groups
Following the characterization of clusters, we used the size of the hospital as the first factor for defining peer groups and its case mix complexity as the second factor.

Figure 2: Characterization of Each Cluster Using Cluster and Decision-Tree Analyses

The new classification criteria were developed in two steps. The first step involved a cluster analysis to classify general hospitals into two groups using size-related variables (i.e., number of specialists, nurses, beds, and specialties), which were converted to a new variable, size. The correlation coefficient
among the four size-related variables was as high as 0.8, and thus all variables
were converted into one principal component value to perform the cluster
analysis. In the principal component analysis, Prin1 explained 85 percent of
the total variance, and the principal component score for Prin1 was identical
in all variables. Therefore, a size variable was produced to standardize and
summarize the four variables with the same weight. The decision tree by Prin1
and that by the new variable of size produced identical results.
We conducted cluster and decision-tree analyses using size to determine the point at which to divide groups. The reference point for size was 8.2, which divided general hospitals into 78 large general hospitals and 144 small and medium-sized general hospitals.
$$\mathrm{Size}_i = N_{pi}/\sigma_p + N_{ni}/\sigma_n + N_{bi}/\sigma_b + N_{si}/\sigma_s,$$

where Npi: number of physician specialists at general hospital i; Nni: number of nurses at general hospital i; Nbi: number of beds at general hospital i; Nsi: number of specialties at general hospital i; σp, σn, σb, σs: standard deviations of Npi, Nni, Nbi, and Nsi, respectively.
The second step involved classifying general hospitals smaller than a
certain size (8.2) into two groups using the CMI. The distribution of the CMIs
in the small and medium-sized general hospitals (n = 144) was positively
skewed. The frequency of the distribution dramatically decreased at CMI >
0.7, revealing outliers among the hospitals. Thus, we divided the small and
medium-sized general hospitals into two groups at the 75th percentile
(CMI = 0.73).
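Taken together, the proposed criteria amount to a simple two-step rule. The cutoffs (size 8.2, CMI 0.73) are the values reported in the text, while the function and key names below are our own illustrative choices:

```python
def size_index(n_specialists, n_nurses, n_beds, n_specialties, sd):
    # Size_i = Npi/sd_p + Nni/sd_n + Nbi/sd_b + Nsi/sd_s (equal weights).
    return (n_specialists / sd["specialists"] + n_nurses / sd["nurses"]
            + n_beds / sd["beds"] + n_specialties / sd["specialties"])

def peer_group(size, cmi, size_cut=8.2, cmi_cut=0.73):
    """Two-step assignment: hospital size first, then case mix (CMI)."""
    if size >= size_cut:
        return "large"
    return "small/medium, severe cases" if cmi >= cmi_cut else "small/medium"
```

This is the property that makes the criteria administratively usable: a newly established hospital can be assigned from four registry counts and its CMI without rerunning any clustering.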
When we assessed the consistency of the groupings according to size and CMI with that of all but the first previous clustering results, including the proportion of pediatric patients, the weighted kappa statistics were 0.77, 0.74, and 0.63 for the second, third, and fourth clusterings, respectively. The highest level of classification consistency was with the second clustering, which indicated that the proposed classification criteria reflected the previous clustering.
Comparison of Characteristics among Proposed Peer Groups
Finally, the proposed new peer groups were compared in Table 2. Cluster 1, small and medium-sized general hospitals, had the lowest numbers of beds and specialists and the lowest CMI. Cluster 2, large general hospitals, had larger staffs, greater numbers of specialists and beds, and the highest CMIs. Cluster 3, small and medium-sized hospitals treating severe cases, was similar to the small and medium-sized group in terms of numbers of specialists and beds, but it was higher than that group in terms of CMIs (Table 2).

Table 2: Comparison of Characteristics among Proposed Peer Groups (Large General Hospitals, N = 78; Small and Medium General Hospitals Treating Severe Cases, N = 36; Small and Medium General Hospitals, N = 108; values are mean ± SD, compared by χ²/F; *p < .05, **p < .001, ***p < .0001).
Table 3: R² Statistics: Comparison of Medical Costs per Episode and CMIs between Proposed and Current Peer Groups (Unit: %; N = 222)
Medical costs per episode: Proposed Peer Groups, 65.4; Current Peer Groups, 51.1
CMI: Proposed Peer Groups, 56.6; Current Peer Groups, 42.2
The R² statistics for the medical costs per episode and the CMI for the proposed peer-group classification were 65.4 and 56.6 percent, respectively, both of which were higher than those (51.1 and 42.2 percent) for the current peer-group classification (Table 3).
This study presents criteria to classify general hospitals into three peer groups (group 1: large general hospitals [size ≥ 8.2]; group 2: small and medium-sized general hospitals with severe cases [size < 8.2, CMI ≥ 0.73]; group 3: small and medium-sized general hospitals [size < 8.2, CMI < 0.73]). According to consistency testing, the proposed classification criteria reflected the results of the cluster analysis well. The R² statistics for the variation in costs per episode and the CMI, both of which influence the efficiency index, were shown to increase substantially in the peer groups classified according to the proposed grouping criteria (R²: 65.4 and 56.6 for cost per episode and CMI, respectively) compared with those in the groups classified according to the current grouping criteria, which use number of beds (R²: 51.1 and 42.2 for cost per episode and CMI, respectively). Indeed, when the R² of the effect of peer-group classification on total variation was higher, the variation within the groups decreased, whereas that among the groups increased.
Structural differences in staff, facilities, equipment, and case mix can lead to differences in patient-care costs and therefore in cost efficiency (Ellis and McGuire 1988).
The issue of fairness in comparisons of the performance of hospitals was
explored by Jencks et al. (1984) and Ellis and McGuire (1988). They demon-
strated that the systematic risks, which health care providers cannot manage
independently, should be taken into account when comparing performance.
Therefore, when the definition of a peer group incorporates adjustments for
systematic risks, the fairness of comparisons is enhanced.
To adjust for differences among hospitals in systematic risks, Stefos, Lav-
allee, and Holden (1992) defined peer groups using cluster analysis so that the
various characteristics of the hospitals in a group tended to be similar. Zodet
and Clark (1996) classified hospitals in Michigan into 13 groups that displayed
similar characteristics. A recent study suggested a new method for measuring
the similarity of the characteristics of hospitals using Euclidean distance to
define peer groups ofpublichealth care institutions(Byrne et al. 2009).
These studies addressed classification methods using cluster analysis,
but they did not present a method for developing classification criteria that
reflect the results of clustering, which health insurance administrators could
use to assign a newly established general hospital, or an established hospital whose characteristics have changed, to a peer group within a reasonable period of time. The current study could therefore increase the practical application of cluster analysis.
hospitals examined in previous studies in terms of cost efficiency, including
human resources, facilities, equipment, case mix per disease, CMI, and spe-
cialization index (Ament, Kobrinski, and Wood 1981; Becker and Steinwald
1981; Rosko and Carpenter1993).
When the performance of a health care provider is compared with the
average performance of the appropriate peer group, the definition of the peer
group may also influence the accuracy of the cost-efficiency index. The Medicare Resource Use Reports Plan also includes the selection of peer groups among the main tasks to be completed prior to the calculation of the efficiency index (MaCurdy et al. 2008).
Many health care systems encourage health care facilities that show
extreme practice patterns to reach the peer-group-specific benchmark. How-
ever, the definition of comparable hospitals has always proven controversial
(Stefos, Lavallee, and Holden 1992; Pink et al. 2009). The appropriate
benchmark may vary depending on the interests of the stakeholders. Health
care institutions would prefer to be compared within peer groups whose
members possess similar characteristics. In contrast, health care consumers
would ask for a comparison of performance with the nationwide average, as
this information can be used to select health care institutions from which ser-
vices can be purchased (Austin et al. 2004; Romano 2004). Health insurance
administrators must consider both interests in determining the appropriate benchmark level if the comparison is to be meaningful enough to promote improvement.
Pink et al. (2009) discussed a peer-group effect whereby the propor-
tion of health care providers meeting the benchmark varies by peer group,
indicating that the ability to reach a certain level differs among peer groups.
It was therefore claimed that a benchmark should represent a high level of
financial performance regardless of the factors that influence the ability of a
hospital to reach the benchmark. That is, if hospitals were compared only
with others in their peer group, the overall efficiency of health care institu-
tions may decrease further. However, it is difficult to ignore the assertion
that hospitals have immutable systematic characteristics and that these char-
acteristics may have a negative effect on both financial performance and
quality of care (Ellis and McGuire 1988; Austin et al. 2004; Romano 2004;
Byrne et al. 2009).
Potential users of our methodology may ask which variables should be
part of a risk-adjustment system and which should be used to define a peer
group. We would respond that peer grouping constitutes an additional option
for determining the benchmark used for comparisons because hospitals out-
side the peer group would not be included in the comparative analysis. In this
study, we included each hospital's CMI, a measure of hospital-wide case complexity, as a variable for classifying hospitals into peer groups.
In Korea, the HIRA regularly defines peer groups in terms of number of beds and compares the risk-adjusted performance (using DRGs) of hospitals.
This study emphasized the need to adjust for the risk level of a hospital's patients when comparing performance and further suggested the desirability of including a hospital-wide case-complexity measure in the definition of peer groups.
The present study may have some limitations. First, the adjustment for
systematic risksamongthe peergroups may havebeenbeinadequate.Indeed,
the first clustering of the cluster analysis, in which the proportion of patients
by disease type was considered, differed from the other three clusterings. This
indicates that the proposed peer-group classification cannot consider all features of a hospital. However, because specialized hospitals were excluded from this study, additional consideration of such hospitals is required; it may be possible to overcome these limitations by focusing the scope of the evaluation on specific disease types. In addition, it is important to re-evaluate the appropriateness of peer-group assignments and to redefine peer groups periodically.
Second, determining the appropriate number of clusters is difficult; no
statistical tests currently exist to confirm that group numbers are optimal. In
this study, a two-stage cluster analysis was performed (Stefos, Lavallee, and
Holden 1992). As a first stage, Ward's hierarchical method was used to select
the appropriate number of groups. The changes in the R² and pseudo-t² values
were analyzed according to changes in the number of clusters, and the results
suggested two to four clusters. We set three groups as the optimal number of clusters, as the number of groups should be minimized to the extent that it satisfies both statistical accuracy, after adjusting for systematic risk, and administrative efficiency.
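The two-stage procedure described above can be sketched as follows. This is an illustrative Python sketch on synthetic data, not the authors' implementation: it applies Ward's hierarchical clustering and compares the R² (proportion of variance explained) across candidate numbers of clusters, then refines the chosen three-group solution with k-means as a second stage. The pseudo-t² statistic and the actual hospital variables are omitted here.

```python
# Illustrative two-stage cluster analysis: Ward's hierarchical method to
# choose the number of groups, then k-means refinement (synthetic data).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2, whiten

rng = np.random.default_rng(0)
# Synthetic "hospital" features, e.g., standardized size and CMI,
# drawn around three well-separated centers (40 hospitals each)
X = np.vstack([rng.normal(loc, 0.3, size=(40, 2)) for loc in (-2.0, 0.0, 2.0)])
X = whiten(X)  # rescale each feature to unit variance before clustering

# Stage 1: Ward's hierarchical clustering; examine R^2 for candidate k
Z = linkage(X, method="ward")
total_ss = ((X - X.mean(axis=0)) ** 2).sum()
for k in range(2, 6):
    labels = fcluster(Z, t=k, criterion="maxclust")
    within_ss = sum(((X[labels == g] - X[labels == g].mean(axis=0)) ** 2).sum()
                    for g in np.unique(labels))
    r2 = 1 - within_ss / total_ss  # variance explained by the k clusters
    print(f"k={k}: R^2={r2:.3f}")

# Stage 2: refine the chosen 3-group solution with k-means
centroids, final_labels = kmeans2(X, k=3, seed=0, minit="++")
print("group sizes:", np.bincount(final_labels))
```

Statistical packages that implement Ward's method typically report both R² and pseudo-t² directly at each step of the hierarchy, so in practice the candidate numbers of clusters can be read off the clustering history rather than recomputed as above.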
Third, as no information regarding patient addresses was available in the HIRA claims data, we could not include the competition level of the hospital location as a variable. When discussing the causes of variation in health
care utilization by location, the medical supply and market aspects that cause
inappropriate and excessive utilization of health care services have been of
interest (Goodman and Green 1996; Do 2007). However, the CMI may
already reflect the structural or external circumstances of a general hospital,
and thus the exclusion of such variables would not constitute a major bias in
the study results (Becker and Steinwald 1981).
Continued efforts to improve the fairness and accuracy of peer-group
comparisons are required to increase the receptivity of health care providers
to performance profiling. In this context, this study is expected to provide use-
ful information for defining and managing peer groups in the assessment of
health care performance.
This study demonstrated the procedure for performing cluster analyses of the
characteristics of general hospitals that provide health care under the Korean NHI program so that the systematic risks within a peer group are similar and
peer-group-classification criteria can be developed. Our results suggest that
cluster analysis may be a useful method for classifying multipurpose, multi-
product hospitals into peer groups because it is a multivariate procedure that
simultaneously considers all variables affecting performance. This study con-
tributes to increasing the fairness and accuracy of performance comparisons
among health care providers and also expands the utilization of statistical analysis through the development of classification criteria reflecting the results of cluster analysis.
Joint Acknowledgment/Disclosure Statement: The authors thank the Health Insur-
ance Review and Assessment Service (HIRA) of Korea. This study was con-
ducted using data from the Korean National Health Insurance Claims
Database of the HIRA.
Ament, R. P., E. J. Kobrinski, and W. R. Wood. 1981. “Case Mix Complexity Differences between Teaching and Nonteaching Hospitals.” Journal of Medical Education 56 (11): 894–903.
Austin, P. C., D. A. Alter, G. M. Anderson, and J. V. Tu. 2004. “Impact of the Choice of Benchmark on the Conclusions of Hospital Report Cards.” American Heart Journal.
Becker, E. R., and B. Steinwald. 1981. “Determinants of Hospital Case Mix Complexity.” Health Services Research 16 (4): 439–58.
Bindman, A. B. 1999. “Can Physician Profiles Be Trusted?” Journal of the American Medical Association 281 (22): 2142–3.
Byrne, M. M., C. N. Daw, H. A. Nelson, T. H. Urech, K. Pietz, and L. A. Petersen.
2009. “Method to Develop Health Care Peer Groups for Quality and Financial
Comparisons across Hospitals.” Health Services Research 44 (2): 577–92.
Do, Y. K. 2007. “Research on Geographic Variations in Health Services Utilization in
the United States: A Critical Review and Implication.” Korean Journal of Health
Policy and Administration 17 (1): 91–124.
Ellis, R. P., and T. G. McGuire. 1988. “Insurance Principles and the Design of a Prospective Payment System.” Journal of Health Economics 7 (3): 215–37.
Goodman, D. C., and G. R. Green. 1996. “Assessment Tools: Small Area Analysis.”
American Journal of Medical Quality 11 (1): S12–4.
Health Insurance Review and Assessment Service (HIRA). 2010. “Major Activities of
HIRA: Review” [accessed on August 8, 2010]. Available at http://www.hira.or.
Jencks, S. F., A. Dobson, P. Willis, and P. H. Feinstein. 1984. “Evaluating and Improv-
ing the Measurement of Hospital Case Mix.” Health Care Financing Review
(Annual Supplement, November): 1–11.
Jian, W., Y. Huang, M. Hu, and X. Zhang. 2009. “Performance Evaluation of Inpatient
Service in Beijing: A Horizontal Comparison with Risk Adjustment Based on
Diagnosis Related Groups.” BMC Health Services Research 9 (72): DOI: 10.1186/
Lee, K. H. 2007. “The Effects of Case Mix on Hospital Costs and Revenues for Medicare Patients in California.” Journal of Medical Systems 31 (4): 254–62.
Lee, S. J., and J. Y. Park. 2010. “Changing Trends in Daegu and Gyeongbuk-Based
Patients’ Use of Health Facilities in Seoul.” Korean Journal of Health Policy and Administration 20 (4): 19–44.
MaCurdy, T., N. Theobald, J. Kerwin, and K. Ueda. 2008. “Prototype Medicare Utili[…]zations in Medicare.” Health Care Financing Review 29 (1): 31–43. Available at https://www.cms.gov/reports/downloads/MaCurdy2.pdf.
National Health Insurance Corporation (NHIC). 2007. National Health Insurance Program.
National Health Insurance Corporation (NHIC). 2011. “NHI Program” [accessed on May 10, 2011]. Available at http://www.nhic.or.kr/english/insurance/insurance01.
Park, H., and Y. Shin. 2004. “Measuring Case-Mix Complexity of Tertiary Care Hospitals Using DRGs.” Health Care Management Science 7 (1): 51–61.
Pink, G. H., G. M. Holmes, R. T. Slifkin, and R. E. Thompson. 2009. “Developing
Financial Benchmarks for Critical Access Hospitals.” Health Care Financing Review 30 (3): 55–69.
Romano, P. S. 2004. “Peer Group Benchmarks Are Not Appropriate for Healthcare
Quality Report Cards.” American Heart Journal 148 (6): 921–3.
Rosko, M. D., and C. E. Carpenter. 1993. “Development of a Scalar Hospital-Specific Severity of Illness Measure.” Journal of Medical Systems 17 (1): 25–36.
Sandy, L. G. 1999. “The Future of Physician Profiling.” Journal of Ambulatory Care Management 22 (3): 11–6.
Smith, W. R. 2000. “Evidence for the Effectiveness of Techniques to Change Physician
Behavior.” Chest 118 (2 suppl): 8s–17s.
Stefos, T., N. Lavallee, and F. Holden. 1992. “Fairness in Prospective Payment: A Clustering Approach.” Health Services Research 28 (2): 239–61.
Wachtel, R. E., F. Dexter, B. Barry, and C. Applegeet. 2010. “Use of State Discharge
Abstract Data to Identify Hospitals Performing Similar Types of Operative Procedures.” Anesthesia and Analgesia 110 (4): 1146–54.
Weiss, K. B., and R. Wagner. 2000. “Performance Measurement through Audit, Feed-
back, and Profiling as Tools for Improving Clinical Care.” Chest 118 (2 suppl).
Zodet, M. W., and J. D. Clark. 1996. “Creation of Hospital Peer Groups.” Clinical Performance and Quality Health Care 4 (1): 51–7.
Additional supporting information may be found in the online version of this article.
Please note: Wiley-Blackwell is not responsible for the content or func-
tionality of any supporting materials supplied by the authors. Any queries
(other than missing material) should be directed to the corresponding author
for the article.