Journal of Rehabilitation Research & Development
Volume 47, Number 8, 2010
Mental illness-related disparities in length of stay: Algorithm choice
Susan M. Frayne, MD, MPH;1–3* Eric Berg, MS;1–2 Tyson H. Holmes, PhD;4 Kaajal Laungani, BA;1 Dan R.
Berlowitz, MD, MPH;5 Donald R. Miller, ScD;5 Leonard Pogach, MD, MBA;6 Valerie W. Jackson, MPH;1–2
Rudolf Moos, PhD1,3–4
1Center for Health Care Evaluation, Department of Veterans Affairs (VA) Palo Alto Health Care System, Palo Alto, CA;
2Division of General Internal Medicine, Stanford University, Palo Alto, CA; 3Center for Primary Care and Outcomes
Research, Stanford University, Palo Alto, CA; 4Department of Psychiatry and Behavioral Sciences, Stanford University,
Palo Alto, CA; 5Center for Health Quality, Outcomes, and Economic Research, Edith Nourse Rogers Memorial Veterans
Hospital, Bedford, MA; and Boston University School of Public Health, Boston, MA; 6Center for Healthcare Knowledge
Management, VA New Jersey Health Care System, East Orange, NJ; and University of Medicine and Dentistry of New
Jersey-New Jersey Medical School, Newark, NJ
Abstract—Methodological challenges arise when one uses vari-
ous Veterans Health Administration (VHA) data sources, each
created for distinct purposes, to characterize length of stay (LOS).
To illustrate this issue, we examined how algorithm choice affects
conclusions about mental health condition (MHC)-related differ-
ences in LOS for VHA patients with diabetes nationally (n =
784,321). We assembled a record-level database of all fiscal year
(FY) 2003 inpatient care. In 10 steps, we sequentially added
instances of inpatient care from various VHA sources. We pro-
cessed databases in three stages, truncating stays at the beginning
and end of FY03 and consolidating overlapping stays. For
patients with MHCs versus those without MHCs, mean LOS was
17.7 versus 13.6 days, respectively (p < 0.001), for the crudest
algorithm and 37.2 versus 21.7 days, respectively (p < 0.001), for
the most refined algorithm. Researchers can improve the quality
of data applied to VHA systems redesign by applying method-
ological considerations raised by this study to inform LOS algo-
Key words: algorithms, databases, Department of Veterans
Affairs, episode of care, healthcare disparities, health services
research, human, length of stay, mental disorders, outcome and
process assessment, patient discharge, physician’s practice pat-
terns, rehabilitation, reproducibility of results, veterans, veter-
Health services researchers often use administrative
data for characterizing length of stay (LOS) to address a
range of objectives. For example, they may examine how
LOS (as a dependent variable) varies as a function of
patient characteristics (e.g., age, race, insurance status,
presence of comorbidity), processes of care (e.g., speed of
emergency department response, types of medications
administered or interventions applied, discharge proto-
cols, etc.), or institutional characteristics (e.g., teaching
hospital, mental health facility, etc.) [1–7]. Alternatively,
Abbreviations: DEpiC = Diabetes Epidemiology Cohort, DSS =
Decision Support System, EXT = extended care, FY = fiscal year,
ICD-9 = International Classification of Diseases-9th Revision,
LOS = length of stay, MHC = mental health condition, OBS =
observation, OPAT = outpatient file, VHA = Veterans Health Administration.
*Address all correspondence to Susan M. Frayne, MD,
MPH; Center for Health Care Evaluation, 795 Willow
Road (152-MPD), Palo Alto, CA 94025; 650-493-5000, ext
23369; fax: 650-617-2690. Email: email@example.com
they may examine LOS as a potential explanatory variable
for predicting other outcomes, or they may restrict
their cohort to patients meeting specific LOS criteria.
Furthermore, accurate identification of intervals of inpa-
tient care is required for studies using an episodes-of-care
The concept of LOS is simple: time from admission to
discharge. However, a number of methodological consider-
ations arise when Veterans Health Administration (VHA)
data are used for calculating LOS. First, goals of the project
must be carefully considered, because this will influence the
algorithm selected. Is the focus on acute or long-term care,
on medical-surgical or mental health stays? Is the objective
to examine total LOS across multiple years or LOS during a
particular interval of study? Second, the algorithm must
account for technical, data-quality issues. These include
duplicate records, overlapping or sequential inpatient stays,
transfers between different inpatient units, and inpatient
stays that are recorded in a subsequent year.
Although numerous studies focus on LOS, these
subtleties of LOS calculation have received little attention.
This oversight could have serious implications: algorithm
choice can influence conclusions in health services studies
[11–13], although to our knowledge this possibility has not
been studied in the specific case of LOS. As VHA leader-
ship increasingly seeks to obtain accurate estimates of
healthcare costs and use evidence to guide strategic plan-
ning decisions, it is critical that the evidence base support-
ing those decisions be as accurate as possible.
One example of a clinical scenario wherein LOS algo-
rithm choice could influence conclusions is mental health
condition (MHC)-related differences in inpatient care use.
Prior studies both within and outside the VHA have docu-
mented that, compared with patients without MHCs,
patients with MHCs tend to use more inpatient care [6,14–
19]. Thus, patients with MHCs represent a particularly
high-intensity, high-cost group likely to merit special
attention by VHA policy makers. However, some charac-
teristics of the way patients with MHC receive inpatient
care may make their VHA records disproportionately sus-
ceptible to variation in algorithm choice. For example,
patients with MHC might be more likely to experience
more complex patterns of inpatient care (e.g., transferring
between a medical unit and a psychiatric unit during the
course of a single hospitalization episode), or to receive
care in extended-care settings, where stays can be long and
can span multiple fiscal years (FYs). Such factors could
potentially influence LOS calculations differently for
patients with MHC versus those without MHC.
We used VHA administrative data to examine how
application of incrementally more refined algorithms for
calculating LOS during 1 year of care affected conclu-
sions about mean LOS in a national cohort of VHA
patients with diabetes. Then, as an illustrative example of
the practical implications of such methodological deci-
sions, we examined whether the magnitude of observed
mental illness-related disparities in mean LOS varied as a
function of LOS algorithm applied.
This work is part of a larger study examining the effect
of MHC on processes of outpatient diabetes care in FY03.
Because the focus of that study is on outpatient care, we
wished to identify (and ultimately exclude from the larger
study) patients who were institutionalized (i.e., on inpa-
tient status) for the majority of FY03. Therefore, our goal
was to identify, for each patient in our cohort, all days in
FY03 during which the patient was on inpatient status
(acute care or extended care). We were not seeking to char-
acterize total LOS for the patients in our cohort (which
could have spanned multiple years), but only those inpatient
days that occurred during FY03. The process of creating
our LOS variable and the effect of algorithm choice on
conclusions about MHC-related differences in LOS are the
focus of the present study.
The cohort was drawn from the FY02 Diabetes Epi-
demiology Cohort (DEpiC), a census of patients with
diabetes in VHA nationally. DEpiC is used extensively
for VHA epidemiological and health services research.
DEpiC identifies patients with diabetes based on the
presence of at least one instance of an antiglycemic pre-
scription or at least two instances of a diabetes Interna-
tional Classification of Diseases-9th Revision (ICD-9)
code in inpatient or outpatient records. Among the
911,451 FY02 DEpiC members who were veterans, used
VHA outpatient care at least once in FY02, and were alive
as of the first day of FY03, we selected the 784,321 whose
MHC status could be verified, as described next (in sub-
sidiary analyses, we included the full 911,451 subjects,
including those with “MHC Possible” status).
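The DEpiC inclusion rule described above can be written as a simple predicate. The following is a minimal sketch only; the function name and its count-based inputs are hypothetical and are not taken from the DEpiC specification.

```python
def meets_diabetes_criteria(n_antiglycemic_rx: int, n_diabetes_icd9: int) -> bool:
    """Return True if a patient satisfies the rule described in the text:
    at least one antiglycemic prescription OR at least two diabetes ICD-9 codes.

    Both arguments are hypothetical per-patient counts prepared upstream.
    """
    return n_antiglycemic_rx >= 1 or n_diabetes_icd9 >= 2
```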
FRAYNE et al. Algorithm choice for length of stay
Steps to Assemble Raw Record-Level Database of
We started by creating a record-level file containing
every instance of inpatient care recorded in any inpatient
database available in centralized VHA files. We selected
only records that contained at least 1 day of inpatient care
in FY03. We also deleted duplicate records. In 10 sequen-
tial “steps,” we pulled all nonduplicate inpatient records
containing any FY03 inpatient care for patients in our
cohort from the following FY03 files:
Step 1. Bedsection file, which represents acute care
Step 2. OBS (Observation) file, which represents
short (e.g., overnight) acute care stays during which the
patient is observed regarding the potential need for
admission to an acute care bed.
Step 3. EXT (Extended Care) file, which represents
long-term care stays (such as rehabilitation stays or nurs-
ing home stays).
Step 4. Census file (for Bedsection, OBS, and EXT),
which includes records for all patients who still held inpatient
status on the last day of the FY, and thus for whom a
discharge date was not available when the files for that
FY were created.
Step 5. Non-VHA file.
Step 6. Fee basis file.
(These latter two files reflect care received outside of
VHA but with funding for the care provided by VHA.)
We then searched FY04 and FY05 files for any
records that included some FY03 care:
Step 7. Sources 1 through 5, FY04.
Step 8. Fee basis FY04 file (presented separately
from other FY04 files to emphasize that fee basis files are
more likely to contain “late entry” records from prior
Step 9. Sources 1 through 5, FY05.
Step 10. Fee basis FY05 file.
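The sequential assembly described above can be sketched as follows. This is an illustration under stated assumptions, not the study's actual code: the `Stay` record layout and function names are hypothetical, FY03 is assumed to run from October 1, 2002, through September 30, 2003, and a "duplicate" is assumed to mean a record identical on every field.

```python
from datetime import date
from typing import Iterable, NamedTuple


class Stay(NamedTuple):
    """Hypothetical minimal inpatient record."""
    patient_id: str
    admit: date
    discharge: date


# Assumption: VHA fiscal year 2003 spans 1 Oct 2002 through 30 Sep 2003.
FY03_START = date(2002, 10, 1)
FY03_END = date(2003, 9, 30)


def touches_fy03(stay: Stay) -> bool:
    """True if at least one day of the stay falls within FY03."""
    return stay.admit <= FY03_END and stay.discharge >= FY03_START


def assemble(sources: Iterable[Iterable[Stay]]) -> list[Stay]:
    """Append sources in step order, keeping only records with FY03 care
    and dropping records that exactly duplicate one already kept."""
    seen: set[Stay] = set()
    kept: list[Stay] = []
    for source in sources:
        for stay in source:
            if touches_fy03(stay) and stay not in seen:
                seen.add(stay)
                kept.append(stay)
    return kept
```

Passing the sources in step order (for example, Bedsection first, then OBS, then EXT) reproduces the cumulative logic of the steps: each later source contributes only records not already present.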
Stages of Processing Record-Level Database of Inpatient
Next, we processed this raw database in sequential
“stages.” Stage A represented the raw file at any given step.
In stage B, we deleted pre-FY03 and post-FY03 care. Spe-
cifically, for records with an admission date earlier than the
first day of FY03, we deleted any days preceding FY03
(i.e., we modified the record to begin on the first day of
FY03), because we were interested in days of care during
FY03, not total LOS for the patient across multiple years.
Similarly, for records with a discharge date later than the
last day of FY03, we modified the record to end on the last
day of FY03.
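Stage B amounts to clipping each record to the fiscal year boundaries. A minimal sketch follows, assuming FY03 runs from October 1, 2002, through September 30, 2003; the function name is hypothetical.

```python
from datetime import date

# Assumption: VHA fiscal year 2003 spans 1 Oct 2002 through 30 Sep 2003.
FY03_START = date(2002, 10, 1)
FY03_END = date(2003, 9, 30)


def truncate_to_fy03(admit: date, discharge: date) -> tuple[date, date]:
    """Clip a stay so it starts no earlier than the first day of FY03
    and ends no later than the last day of FY03."""
    if admit > FY03_END or discharge < FY03_START:
        raise ValueError("stay contains no FY03 care")
    return max(admit, FY03_START), min(discharge, FY03_END)
```

Because the record is modified rather than dropped, the number of records is unchanged by this stage; only the dates (and therefore the LOS) change.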
In stage C, we addressed overlapping stays. Several
types of overlap were observed, as illustrated in Figure 1.
In some cases, the entire stay (admission date through dis-
charge date) was contained within the time interval of
another record. This might happen, for example, if a
patient in a rehabilitation unit was temporarily transferred
to an acute care observation bed for an intercurrent illness
like pneumonia. If the patient was not formally discharged
from the rehabilitation facility prior to the transfer, then
the time interval of the short-term stay (appearing in the
OBS file) could be bracketed by the interval of the long-term
stay (appearing in the EXT file). In other cases, a partial
overlap occurred (e.g., the admission date of one record
fell between the admission and discharge dates of a subsequent
record, or the discharge date of a record fell
between the admission and discharge dates of a subsequent
record). In still other cases, contiguous admissions occurred
(i.e., the discharge date of one record was the same as the
admission date of a subsequent record). For all these over-
lap cases (which could involve a pair of records or even
three or more records), we created a single contiguous
episode of FY03 inpatient care by assigning the admission
date to be the first admission date in FY03 among the
overlapping records and the discharge date to be the last
discharge date in FY03 among the overlapping records.
The resulting file at step 10, stage C, was our final record-
level file of inpatient stays.
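The consolidation rule in stage C corresponds to a standard interval-merging procedure. The sketch below is one way to implement it for a single patient's records; the function name and the tuple representation of a stay are assumptions made for illustration.

```python
from datetime import date


def consolidate(stays: list[tuple[date, date]]) -> list[tuple[date, date]]:
    """Merge one patient's (admit, discharge) records into episodes.

    Records are merged when one is nested in another, when they partially
    overlap, or when they are contiguous (discharge date of one equals the
    admission date of the next).
    """
    episodes: list[tuple[date, date]] = []
    for admit, discharge in sorted(stays):
        if episodes and admit <= episodes[-1][1]:
            # Record starts on or before the current episode's discharge
            # date: extend the episode to the later discharge date.
            start, end = episodes[-1]
            episodes[-1] = (start, max(end, discharge))
        else:
            episodes.append((admit, discharge))
    return episodes
```

Sorting by admission date first means a single pass handles chains of three or more overlapping records, which avoids repeated pairwise passes.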
We calculated LOS for each record as the number of
days from its start through end dates. At each step/stage,
we calculated a cumulative LOS for each patient by add-
ing the record-level LOS for all records included in that
To identify patients with MHC, we used the Agency
for Healthcare Research and Quality’s Clinical Classifications
Software (with minor modifications) to generate a
list of ICD-9 codes indicating the presence of MHC.
A patient was assigned a “Yes” for MHC status if he/she
had at least one instance of an MHC ICD-9 code in any
inpatient record or outpatient face-to-face clinic visit at
baseline (FY01–02) and at least one confirmatory ICD-9
in the study period (FY03). If he/she had no instance of
an MHC ICD-9 in FY01 through 03, then he/she was
assigned MHC status “No.” Otherwise, MHC status was
considered “Possible.” That is, the MHC Possible group
represents those patients who had an MHC diagnosis in
the baseline period or in the study period, but not both.
Cases with MHC Possible status were excluded from
main analyses; this allowed us to compare LOS in two
more sharply defined groups (MHC Yes vs MHC No).
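The three-level MHC classification can be expressed compactly. This is a minimal sketch; the function name and its two Boolean inputs (presence of any MHC ICD-9 code in the baseline period, FY01–02, and in the study period, FY03) are hypothetical.

```python
def mhc_status(code_in_baseline: bool, code_in_study_period: bool) -> str:
    """Classify MHC status from the presence of an MHC ICD-9 code in the
    baseline period (FY01-02) and in the study period (FY03)."""
    if code_in_baseline and code_in_study_period:
        return "Yes"       # baseline diagnosis plus confirmatory diagnosis
    if not code_in_baseline and not code_in_study_period:
        return "No"        # no MHC code anywhere in FY01 through FY03
    return "Possible"      # a code in one period but not the other
```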
We tabulated the number of records and calculated
mean LOS within each cell of a 10 × 3 matrix represent-
ing the steps and stages of database development. Next,
in each cell, we calculated mean LOS as a function of
MHC status. We then calculated the difference (Δ) in
mean LOS among patients with MHC versus those with-
out MHC and compared mean LOS for the MHC Yes
versus MHC No groups using a two-sample t-test. We
applied Bonferroni correction for compounding of Type I
error across multiple comparisons. Results of hypothesis
tests are declared statistically significant for p < 0.05
after Bonferroni correction.
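The difference in means and the Bonferroni adjustment can be illustrated as below. This sketch covers only those two pieces and does not compute the t-test itself; the function names are hypothetical, and the adjustment shown is the common form in which each p-value is multiplied by the number of comparisons and capped at 1.

```python
def mean_difference(mean_mhc_yes: float, mean_mhc_no: float) -> float:
    """Difference in mean LOS: MHC Yes minus MHC No."""
    return mean_mhc_yes - mean_mhc_no


def bonferroni_significant(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Flag each test as significant after Bonferroni correction.

    Each raw p-value is multiplied by the number of comparisons
    (capped at 1.0) and then compared with alpha.
    """
    m = len(p_values)
    return [min(1.0, p * m) < alpha for p in p_values]


# Means reported in the abstract for the crudest algorithm: 17.7 vs 13.6 days.
delta = mean_difference(17.7, 13.6)
```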
Among the 784,321 patients with diabetes in the full
cohort, 152,591 were identified as having evidence of an
MHC diagnosis (MHC Yes). Among the subset of 92,255
patients who received any inpatient care in FY03 (based on
step 10, stage C), 39,452 had MHC Yes. Table 1 presents
the age, sex, Physical Comorbidity Index score (a count
from 0–35, developed for case mix adjustment in VHA
patients [22–23]), and primary care use in the full cohort
and in the subset who used inpatient care, by MHC status.
Table 2 catalogs the number of records and LOS at
each step/stage in the database assembly process. The
cumulative number of patients who are identified as hav-
ing received inpatient care in FY03 (based on stage C)
increases progressively from step 1 to step 10 (as does the
number of records). For example, when the OBS file was
added to the Bedsection file, an additional 10,660 records
were added for stays that did not perfectly duplicate a
Bedsection file stay for that patient. This is expected,
because additional evidence of inpatient care is added at
each step. More noteworthy is that some steps contribute
more records than others.
The number of records does not change at stage B
(compared with stage A), because this processing step
truncates records (to include only inpatient days during
FY03) but does not delete records. However, at stage C
(record consolidation), the number of records drops sub-
stantially, because overlapping stays are merged into a
single, longer stay.
Figure 1. Patterns I–V of overlap between pairwise records of an individual
patient and record-level frequency of each pattern at step 10, stage C.
Consistent with these observations, mean LOS at
stage C increased progressively with sequential steps
(i.e., as more sources of data were added), except at step
2 (where patients with short OBS stays were added) and
at step 5 (where patients with non-VHA stays were
added). Similarly, mean LOS decreased progressively
with sequential stages. That is, mean LOS decreased
from stage A to stage B as non-FY03 days were deleted
(which would be relevant to a study like ours that focuses
on care received in a single FY). Mean LOS also
decreased from stage B to stage C as overlapping days
were deleted (which would be relevant to the accuracy of
the LOS estimate in any study design). Across the 10 × 3
matrix, mean LOS ranged from 13.8 to 74.9 days.
Table 3 presents LOS by MHC status at every step/
stage in the database assembly process. The calculated difference
(Δ) in mean LOS between the MHC Yes and the
MHC No groups varied markedly by algorithm and was
statistically significant (p < 0.001) at every step/stage.
Correction for multiple comparisons did not change the
statistical significance of any finding. As illustrated in Figure 2, at step
1, Δ = 4.1 at stage A and 3.8 at stage C. In contrast, at step
10, Δ = 57.8 at stage A and 15.5 at stage C (p < 0.01 for
both between-algorithm comparisons of the values of Δ).
To obtain the LOS in stage C, for each pair of overlap-
ping records, we generated a single record by setting the
FY03 admission date as the earliest of the two admission
dates and the FY03 discharge date as the latest of the two
Table 1. Characteristics of cohort by mental health condition (MHC) status (full cohort and subset who used Veterans Health Administration inpatient
Full Cohort, n = 784,321
Age (years, mean ± SD) 62.1 ± 11.6
Physical Comorbidity Index (mean ± SD) 3.6 ± 2.4
Used Primary Care in FY03 (%) 93.6
*Inpatient user cohort selected from step 10, stage C.
FY = fiscal year, SD = standard deviation.
Inpatient Users,* n = 92,255
61.4 ± 11.9
4.6 ± 2.8
69.6 ± 10.3
2.8 ± 2.0
69.1 ± 10.5
4.5 ± 2.7
Table 2. Effect of sequential data assembly steps and data cleaning stages on number of patients identified as having received inpatient care and on count
of inpatient records and mean length of stay (LOS).

| Step | Patients* | Records Added | Number of Records, Stage A | Number of Records, Stage B | LOS (days), Mean ± SD, Stage A | LOS (days), Mean ± SD, Stage B | LOS (days), Mean ± SD, Stage C |
|---|---|---|---|---|---|---|---|
| 1. Bedsection FY03 | 77,817 | 173,707 | 173,690 | 173,690 | 15.3 ± 21.1 | 14.8 ± 20.1 | 14.2 ± 19.4 |
| 2. OBS FY03 | 81,489 | 10,660 | 184,350 | 184,350 | 14.9 ± 22.8 | 14.4 ± 19.8 | 13.8 ± 19.2 |
| 3. EXT FY03 | 85,198 | 14,844 | 199,194 | 199,194 | 27.3 ± 103.8 | 21.7 ± 37.8 | 20.7 ± 36.1 |
| 4. Census FY03 | 86,990 | 6,990 | 206,184 | 206,184 | 37.6 ± 198.4 | 26.7 ± 53.1 | 25.6 ± 51.2 |
| 5. Non-VHA FY03 | 89,135 | 5,438 | 211,622 | 211,622 | 37.2 ± 196.2 | 26.5 ± 52.8 | 25.4 ± 50.9 |
| 6. Fee FY03 | 90,558 | 15,107 | 226,729 | 226,729 | 40.4 ± 198.8 | 29.8 ± 61.9 | 27.7 ± 56.2 |
| 7. FY04 Records | 90,689 | 5,898 | 232,627 | 232,627 | 58.3 ± 370.4 | 35.7 ± 91.2 | 28.0 ± 57.4 |
| 8. Fee FY04 | 92,068 | 6,829 | 239,456 | 239,456 | 58.4 ± 368.5 | 36.1 ± 93.0 | 28.3 ± 58.9 |
| 9. FY05 Records | 92,181 | 1,569 | 241,025 | 241,025 | 74.9 ± 518.8 | 39.4 ± 113.4 | 28.4 ± 59.3 |
| 10. Fee FY05 | 92,255 | 293 | 241,318 | 241,318 | 74.8 ± 518.6 | 39.4 ± 113.4 | 28.4 ± 59.3 |

Note: To create table, we started with step 1 and completed cells across each stage sequentially. Then, for the step 2 analyses, we started with records from steps 1
and 2 and completed cells across each stage sequentially. Analyses for each subsequent step similarly included records from all prior steps. Stages were stage A
(original record), stage B (delete days prior to first day of FY03 and after last day of FY03), and stage C (consolidate overlapping stays).
*Reflects cumulative number of patients who received inpatient care in FY03 at each step at stage C. Inpatient records were drawn from patients in analytical cohort
(n = 784,321).
EXT = extended care, FY = fiscal year, OBS = observation, SD = standard deviation, VHA = Veterans Health Administration.
discharge dates. We repeated this process iteratively until
all pairwise overlaps were addressed. This data processing
stage was the most involved, because it needed to account
for multiple potential overlap patterns, as illustrated sche-
matically in Figure 1. The most common overlap pattern
(pattern I) was contiguous records, i.e., where the dis-
charge date of one record was the admission date of the
following record. This pattern would happen, for example,
if a patient were admitted to one bed section (e.g., to the
Psychiatry Department for suicidal ideation) and then
transferred to another bed section (e.g., to General Medi-
cine for a hospital-acquired infection). Of note, we used
the Bedsection files for these analyses. VA Bedsection files
create a new record each time a patient transfers to a differ-
ent clinical service (“bedsection”) during a hospital stay.
This is in contrast to the VA Main files, which create a new
record for each stay; all contiguous bedsection stays are
combined in a single record. Had we used the Main file
instead of the Bedsection file, we expect that we would not
have encountered this particular form of overlap. Other
overlap patterns were also observed, as Figure 1 shows.
Of note, step 10, stage B, yielded LOSs of more than 365
days for 3.2 percent of the MHC Yes group and 1.4 percent
of the MHC No group, clearly representing a residual
problem with the algorithm; in contrast, no patient had
LOS greater than 365 days at stage C. This finding sup-
ports the importance of the stage C processing.
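The way unconsolidated records can yield more than 365 inpatient days in one fiscal year can be shown with a small hypothetical patient. The sketch assumes that both the admission and discharge dates count as inpatient days (the text does not state whether the count is inclusive) and that FY03 runs from October 1, 2002, through September 30, 2003; all names and dates are illustrative.

```python
from datetime import date


def los_days(admit: date, discharge: date) -> int:
    # Assumption: admission and discharge dates both count as inpatient days.
    return (discharge - admit).days + 1


def merge(stays):
    """Consolidate overlapping or contiguous (admit, discharge) records."""
    merged = []
    for admit, discharge in sorted(stays):
        if merged and admit <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], discharge))
        else:
            merged.append((admit, discharge))
    return merged


def cumulative_los(stays) -> int:
    """Sum record-level LOS across all of a patient's records."""
    return sum(los_days(a, d) for a, d in stays)


# Hypothetical patient: an extended care stay covering all of FY03, plus a
# short observation stay recorded separately while the first stay was open.
ext_stay = (date(2002, 10, 1), date(2003, 9, 30))
obs_stay = (date(2003, 1, 10), date(2003, 1, 12))
records = [ext_stay, obs_stay]

before = cumulative_los(records)        # stage B style: days double counted
after = cumulative_los(merge(records))  # stage C style: one episode
print(before, after)  # 368 365
```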
In a subsidiary analysis, we found that both the admis-
sion and discharge dates fell within FY03 for 91 percent of
records at step 10, stage A. In those instances, the full LOS
for that episode of care was captured and no truncation
Our main analyses excluded patients who had MHC
Possible status (i.e., those patients who had an MHC diag-
nosis in the baseline period or in the study period, but not
both). In another subsidiary analysis (see online Appen-
dix), we repeated the main analysis in the initial cohort
Table 3. Effect of sequential data assembly steps/data cleaning stages on fiscal year (FY) 2003 length of stay (LOS) calculations by mental health
condition (MHC) status.
1. Bedsection FY03 17.7
2. OBS FY03 17.3
3. EXT FY03 36.6
4. Census FY03 51.6 27.2 24.4 34.0
5. Non-VHA FY03 51.0
6. Fee FY03 56.1
7. FY04 Records 82.1
8. Fee FY04 82.3 40.5 41.8 47.8
9. FY05 Records 108.0 50.2 57.8 52.5
10. Fee FY05 107.9 50.1 57.8 52.4
Note: Every difference (Δ) between mean LOS for MHC Yes vs MHC No in this table is statistically significant at p < 0.001. Stages were stage A (original record),
stage B (delete days prior to first day of FY03 and after last day of FY03), and stage C (consolidate overlapping stays). Two sample t-tests were conducted for two
key comparisons in this table: comparing Δ within step 1 for stage A vs stage C and within step 10 for stage A vs stage C (p < 0.01 for both comparisons).
Δ = mean LOS (MHC Yes) minus mean LOS (MHC No), EXT = extended care, OBS = observation, VHA = Veterans Health Administration.
Figure 2. Effect of sequential data assembly steps and data cleaning stages on
fiscal year 2003 number of inpatient days. MHC = mental health
condition, Δ = mean length of stay (LOS) (MHC Yes) – mean LOS (MHC No).
(n = 911,451), calculating mean LOS as a function of MHC
as a three-way variable (MHC Yes, MHC Possible, MHC
No). Mean LOS for the MHC Possible group was consis-
tently intermediate between that for the MHC Yes and
MHC No groups. For example, for the MHC Possible
group, mean LOS was 16.2 at step 1, stage A; 15.1 at step
1, stage C; 90.3 at step 10, stage A; and 34.4 at step 10,
Choices about what algorithm to use when identify-
ing episodes of inpatient care substantially alter conclu-
sions about the overall intensity of inpatient use and
about MHC-related disparities in LOS. Not searching
across all appropriate sources of data can lead to failure
to capture a substantial amount of inpatient care, thus
leading to underestimates of LOS. Decisions about how
to process records can likewise influence calculated LOS.
While other studies have documented that algorithm
choice can influence conclusions drawn from VHA data
[11–13], we are not aware of this result having been pre-
viously documented for LOS.
Researchers have access to many sources of data about
VHA patients’ nonambulatory care. Indeed, the large num-
ber of sources can bewilder investigators new to VHA
administrative data, who may be unsure which files to
select. Fortunately, the technical manuals developed by the
Department of Veterans Affairs Information Resource Cen-
ter (available at http://www.virec.research.va.gov/) and the
Department of Veterans Affairs Health Economics Resource
Center (available at http://www.herc.research.va.gov/)
explain these files in detail. Our data provide further empiric
information to help guide these decisions. First, our results
confirm that adding more data sources identifies more inpa-
tient days. Second, our results indicate that the EXT and
Census files are especially important sources of incremental
days of inpatient care. Third, our results indicate that adding
more data sources also changes conclusions about the mag-
nitude of effect (though not the direction of effect) of MHC
on LOS. The step at which this has a particularly pro-
nounced effect is the addition of EXT files, indicating that,
compared with patients with no MHC, patients with MHC
have disproportionately more frequent or prolonged stays in
the long-term care setting.
Investigators using any VHA database need to exam-
ine data closely to determine whether data processing steps
are necessary. In the case of inpatient files, our data indicate
that, in addition to the standard procedure of deleting
pure duplicate records, investigators must account for
overlapping stays (wherein a single day can be counted
twice) and, for studies such as ours that focus on a single
year of care, must truncate days falling before or after the FY
of interest. Such pitfalls could, in some cases, reflect data
quality problems, such as a data-entry error in admission
or discharge date. However, in many cases, they may not
represent deficits in the quality of VHA administrative
data, but instead may reflect VHA clinical/administrative
record-keeping practices. For example, a single stay could
legitimately be recorded in more than one file if these files
are used differently. Similarly, a fee basis stay (with the
correct admission and discharge dates) could be filed in a
subsequent year’s records if a delay occurred in receipt of
the bill from the outside vendor. Regardless of whether
some of these factors represent data quality problems,
investigators need to account for them; if not, some
patients will have inflated estimates of LOS. Indeed, with-
out such corrections, some patients will appear to be on
inpatient status for more than 365 days in a single FY.
While the focus of this study is on the issue of algo-
rithm choice for calculation of LOS, we use MHC-related
disparities in LOS as a case study to illustrate what can
happen if such issues are not considered. Health services
researchers frequently examine disparities in processes
and outcomes of care. Historically, interest in disparities
related to characteristics like race, sex, and age has been
great, but emerging evidence suggests that disparities
related to MHC status are also common [9,24]. We dem-
onstrated that the magnitude of MHC-related differences
in LOS varied markedly as a function of LOS algorithm.
Thus, the methodological issues raised here are not just
theoretical: algorithm choice can have marked effects on
conclusions in healthcare disparities research.
A subsidiary benefit of conducting analyses for this illustrative
example was that informative
findings about associations between MHC status and
LOS emerged. Patients with MHC spent more of FY03
on inpatient status than did patients with no MHC; this
was a consistent and robust finding across every algo-
rithm examined. This finding is consistent with other
studies that have shown heavier use of inpatient services
by patients with MHC [6,14–19]. Our study also shows
that some types of care (e.g., EXT) are associated with a
disproportionately greater MHC effect. Another strength
of our approach is that we distinguished between patients
with stronger evidence of MHC (i.e., at least one MHC
diagnosis at baseline in FY01–02 and at least one confir-
matory MHC diagnosis in the study period, FY03) and
patients with less certain (Possible) MHC status (i.e.,
presence of an MHC diagnosis either at baseline or in the
study period, but not both). Our subsidiary analyses pro-
vide information about MHC Possible patients, a group
that has not been well characterized in prior work. The
MHC Possible group is likely heterogeneous and includes
patients with an erroneous MHC diagnosis, with transient
or resolved MHC, or with less severe MHC, as well as
patients who receive part of their care outside the VHA
system. Mean LOS for the MHC Possible group consis-
tently fell between the mean LOS observed for the MHC
Yes and the MHC No groups.
Interpretation of our findings is subject to several
caveats. First, our aim was to calculate total number of
days spent on inpatient status during FY03; values should
not be interpreted as indicating total LOS across years.
However, for 91 percent of records, the patient’s com-
plete stay was contained within FY03. Second, we did
not use the VHA Decision Support System (DSS) Outpa-
tient (OPAT) file as a data assembly step. In the OPAT
file, Stay Type 42, Bedsection 80 refers to nursing home
care reimbursed by VHA in any particular month. How-
ever, dates of admission and discharge could not be accu-
rately generated from that source. Third, our focus was
on VHA use. Depending on an investigator’s study ques-
tion, capturing inpatient days spent in other settings
might also be important, such as days identified from
Medicare claims data, which can be linked to VHA
administrative data. Fourth, because the purpose of
our study was to identify periods during which the patient
was on nonoutpatient status, our LOS calculations
included both acute care and long-term care days. Studies
focusing on one or the other setting might need to con-
sider other methodological issues. For example, a
patient’s stay in a skilled nursing facility could have short
gaps (e.g., for a brief acute care stay), which might not be
captured with the databases used. Fifth, our main analy-
ses excluded patients whose MHC status could not be
ascertained with certainty (MHC Possible), so LOS esti-
mates cannot be generalized to all VHA patients. Subsidiary
analyses suggested that these excluded patients had inter-
mediate LOS and that algorithm choice similarly affected
LOS calculations for them. Sixth, MHC diagnoses came
from ICD-9 diagnosis codes in VHA administrative data
rather than from direct assessment of patients’ MHC.
Given the known problem of underdiagnosis of MHC
[26–27], some patients with MHC are likely included in
the MHC No group. This would be expected to bias
results toward the null.
This study examines methods that should be consid-
ered when an algorithm is developed that uses VHA data
to calculate LOS. The specific algorithm selected will
depend on the research question, such as—
• What types of inpatient care are of interest? For exam-
ple, is the focus on acute care, extended care, care
received on a fee basis outside of VHA or some com-
bination of these sources? If rehabilitative/extended
care is the focus, will additional sources (e.g., VHA
EXT, fee basis, non-VHA and DSS OPAT files, as
well as Medicare or Medicaid files) be queried, and
how will multiyear stays be addressed?
• Is the focus on care received in a particular time inter-
val (such as one FY) or on a full episode of inpatient
care? If the former, will subsequent years’ files be
searched for stays recorded in a subsequent FY, and
what is the expected incremental benefit versus cost
of pulling data from multiple years? If the latter, how
many years of data will be searched to identify the
complete LOS, which could potentially span many
• Is the objective to characterize private sector inpatient
care received as well, and if so, should other sources
(such as Medicare claims data) be queried?
Careful consideration of these study design issues
should yield an algorithm tailored to a particular study’s
Accounting for the methodological issues raised here
should help VHA health services researchers avoid pit-
falls in calculation of VHA LOS, such as failure to cap-
ture care recorded in more obscure data sources (leading
to underestimates of LOS) or duplicate counting of some
days of care (leading to overestimates of LOS). More
accurate LOS calculation is expected to support more robust
estimates for economic analyses, since inpatient costs
contribute disproportionately to the total cost of VHA care.
It is also expected to enhance the accuracy of the data VHA
uses in its evidence-based efforts to redesign its healthcare
delivery systems, which aim to improve the quality of
care provided to veterans.
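The duplicate-counting pitfall can be made concrete with a minimal sketch of truncating stays to one FY and consolidating overlapping stays before summing days. This is an illustration under stated assumptions, not the study's algorithm: the function names, the sample dates, and the convention of counting days as discharge date minus admission date are all hypothetical choices.

```python
from datetime import date

# FY03 window (October 1, 2002, through September 30, 2003).
FY_START = date(2002, 10, 1)
FY_END = date(2003, 9, 30)


def truncate_to_window(stays, start=FY_START, end=FY_END):
    """Clip each (admit, discharge) pair to the window; drop stays outside it."""
    clipped = []
    for admit, discharge in stays:
        lo, hi = max(admit, start), min(discharge, end)
        if lo <= hi:
            clipped.append((lo, hi))
    return clipped


def consolidate(stays):
    """Merge stays that overlap or touch into single episodes of care."""
    merged = []
    for admit, discharge in sorted(stays):
        if merged and admit <= merged[-1][1]:
            # Overlaps the previous episode: extend it rather than double count.
            merged[-1] = (merged[-1][0], max(merged[-1][1], discharge))
        else:
            merged.append((admit, discharge))
    return merged


def total_days(stays):
    """Sum days across stays (assumed convention: discharge minus admission)."""
    return sum((discharge - admit).days for admit, discharge in stays)


# Hypothetical records for one patient drawn from two data sources.
stays = [
    (date(2002, 9, 20), date(2002, 10, 5)),   # begins before FY03
    (date(2003, 1, 10), date(2003, 1, 20)),   # source A
    (date(2003, 1, 15), date(2003, 1, 25)),   # source B, overlaps source A
]
crude = total_days(stays)
refined = total_days(consolidate(truncate_to_window(stays)))
print(crude, refined)
```

In this example the crude sum counts the overlapping January days twice and includes days before the FY began, whereas the refined total counts each day of care within the FY once.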
Author Contributions:
Study concept and design: S. M. Frayne, E. Berg, T. H. Holmes,
R. Moos, D. R. Berlowitz, D. R. Miller, L. Pogach.
Acquisition of data: S. M. Frayne, E. Berg, D. R. Miller.
Analysis and interpretation of data: S. M. Frayne, E. Berg, T. H. Holmes,
R. Moos, D. R. Berlowitz, D. R. Miller, L. Pogach, K. Laungani.
Drafting of manuscript: S. M. Frayne, T. H. Holmes, E. Berg.
Critical revision of manuscript for important intellectual content:
S. M. Frayne, E. Berg, T. H. Holmes, R. Moos, D. R. Berlowitz,
D. R. Miller, L. Pogach, K. Laungani, V. W. Jackson.
Statistical analysis: T. H. Holmes, E. Berg.
Obtained funding: S. M. Frayne.
Administrative, technical, or material support: E. Berg, K. Laungani.
Study supervision: S. M. Frayne.
Financial Disclosures: The authors have indicated that no competing
Funding/Support: This material was based on work supported in part
by the National Institutes of Health (grant NIDDK 1 R01 DK071202-01)
and by VA Health Services Research and Development Service
(grant RCS 90-001).
Additional Contributions: The authors are grateful to Ciaran Phibbs,
PhD; Todd Wagner, PhD; and Susan Schmitt, PhD, for their advice and
conceptual input during the process of developing the LOS algorithm.
The views expressed in this article are those of the authors and do not
necessarily represent the views of the VA.
1. Moos RH, Mertens JR. Patterns of diagnoses, comorbidi-
ties, and treatment in late-middle-aged and older affective
disorder patients: Comparison of mental health and medi-
cal sectors. J Am Geriatr Soc. 1996;44(6):682–88.
2. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbid-
ity measures for use with administrative data. Med Care.
1998;36(1):8–27. [PMID: 9431328]
3. Ronis DL, Bates EW, Garfein AJ, Buit BK, Falcon SP,
Liberzon I. Longitudinal patterns of care for patients with
posttraumatic stress disorder. J Trauma Stress. 1996;9(4):
763–81. [PMID: 8902745]
4. Saitz R, Ghali WA, Moskowitz MA. The impact of alco-
hol-related diagnoses on pneumonia outcomes. Arch Intern
Med. 1997;157(13):1446–52. [PMID: 9224223]
5. Cartwright WS, Ingster LM. A patient-based analysis of
drug disorder diagnoses in the Medicare population. Health
Care Financ Rev. 1993;15(2):89–101. [PMID: 10171899]
6. Ettner SL, Hermann RC. Inpatient psychiatric treatment of
elderly Medicare beneficiaries. Psychiatr Serv. 1998;49(9):
1173–79. [PMID: 9735958]
7. Clague JE, Craddock E, Andrew G, Horan MA, Pendleton
N. Predictors of outcome following hip fracture. Admis-
sion time predicts length of stay and in-hospital mortality.
Injury. 2002;33(1):1–6. [PMID: 11879824]
8. Wigder HN, Johnson C, Shah MR. Length of stay predicts
patient and family satisfaction with trauma center services.
Am J Emerg Med. 2003;21(7):606–7. [PMID: 14655246]
9. Frayne SM, Halanych JH, Miller DR, Wang F, Lin H,
Pogach L, Sharkansky EJ, Keane TM, Skinner KM, Rosen
CS, Berlowitz DR. Disparities in diabetes care: Impact of
mental illness. Arch Intern Med. 2005;165(22):2631–38.
10. Hornbrook MC, Hurtado AV, Johnson RE. Health care epi-
sodes: Definition, measurement and use. Med Care Rev.
1985;42(2):163–218. [PMID: 10274864]
11. Frayne SM, Yano EM, Nguyen VQ, Yu W, Ananth L, Chiu
VY, Phibbs CS. Gender disparities in Veterans Health
Administration care: Importance of accounting for veteran
status. Med Care. 2008;46(5):549–53. [PMID: 18438204]
12. Halanych JH, Wang F, Miller DR, Pogach LM, Lin H, Ber-
lowitz DR, Frayne SM. Racial/ethnic differences in diabetes
care for older veterans: Accounting for dual health system
use changes conclusions. Med Care. 2006;44(5):439–45.
13. Borzecki AM, Wong AT, Hickey EC, Ash AS, Berlowitz DR.
Identifying hypertension-related comorbidities from adminis-
trative data: What’s the optimal approach? Am J Med Qual.
2004;19(5):201–6. [PMID: 15532912]
14. Ashton CM, Petersen NJ, Wray NP, Yu HJ. The Veterans
Affairs medical care system: Hospital and clinic utilization
statistics for 1994. Med Care. 1998;36(6):793–803.
15. Verbosky LA, Franco KN, Zrull JP. The relationship
between depression and length of stay in the general hospi-
tal patient. J Clin Psychiatry. 1993;54(5):177–81.
16. Savoca E. Psychiatric co-morbidity and hospital utilization in
the general medical sector. Psychol Med. 1999;29(2):457–64.
17. Saravay SM, Steinberg MD, Weinschel B, Pollack S, Alo-
vis N. Psychological comorbidity and length of stay in the
general hospital. Am J Psychiatry. 1991;148(3):324–29.
18. Bressi SK, Marcus SC, Solomon PL. The impact of psychi-
atric comorbidity on general hospital length of stay. Psychi-
atr Q. 2006;77(3):203–9. [PMID: 16958003]
19. Sayers SL, Hanrahan N, Kutney A, Clarke SP, Reis BF,
Riegel B. Psychiatric comorbidity and greater hospitaliza-
tion risk, longer length of stay, and higher hospitalization
costs in older adults with heart failure. J Am Geriatr Soc.
2007;55(10):1585–91. [PMID: 17714458]
20. Miller DR, Safford MM, Pogach LM. Who has diabetes?
Best estimates of diabetes prevalence in the Department of
Veterans Affairs based on computerized patient data. Dia-
betes Care. 2004;27 Suppl 2:B10–21. [PMID: 15113777]
21. Clinical Classifications Software (CCS) for ICD-9-CM
[Internet]. Rockville (MD): Healthcare Cost and Utilization
Project; 2008. Available from:
22. Selim AJ, Fincke G, Ren XS. The comorbidity index. In:
Goldfield N, Pine M, Pine J, editors. Measuring and man-
aging health care quality: Procedures, techniques, and pro-
tocols. 2nd ed. New York (NY): Aspen; 2002.
23. Selim AJ, Fincke G, Ren XS, Lee A, Rogers WH, Miller
DR, Skinner KM, Linzer M, Kazis LE. Comorbidity assess-
ments based on patient report: Results from the Veterans
Health Study. J Ambul Care Manage. 2004;27(3):281–95.
24. Druss BG, Bradford DW, Rosenheck RA, Radford MJ,
Krumholz HM. Mental disorders and use of cardiovascular
procedures after myocardial infarction. JAMA. 2000;283(4):
506–11. [PMID: 10659877]
25. Fleming C, Fisher ES, Chang CH, Bubolz TA, Malenka DJ.
Studying outcomes and hospital utilization in the elderly.
The advantages of a merged data base for Medicare and
Veterans Affairs hospitals. Med Care. 1992;30(5):377–91.
26. Kimerling R, Ouimette P, Prins A, Nisco P, Lawler C,
Cronkite R, Moos RH. Brief report: Utility of a short
screening scale for DSM-IV PTSD in primary care. J Gen
Intern Med. 2005;21(1):65–67. [PMID: 16423126]
27. Pérez-Stable EJ, Miranda J, Muñoz RF, Ying YW. Depres-
sion in medical outpatients. Underrecognition and misdiag-
nosis. Arch Intern Med. 1990;150(5):1083–88.
Submitted for publication August 3, 2009. Accepted in
revised form January 13, 2010.
This article and any supplementary material should be
cited as follows:
Frayne SM, Berg E, Holmes TH, Laungani K, Berlowitz
DR, Miller DR, Pogach L, Jackson VW, Moos R. Mental
illness-related disparities in length of stay: Algorithm
choice influences results. J Rehabil Res Dev. 2010;47(8):