Techniques for estimating health care costs with censored data: an overview for the health services researcher.

Division of Cardiology, Schulich Heart Centre and Department of Medicine, Sunnybrook Health Sciences Centre, University of Toronto, Toronto, Ontario, Canada.
ClinicoEconomics and Outcomes Research 01/2012; 4:145-55. DOI: 10.2147/CEOR.S31552
Source: PubMed

ABSTRACT The aim of this study was to review statistical techniques for estimating the mean population cost using health care cost data that, because of the inability to achieve complete follow-up until death, are right censored. The target audience is health services researchers without an advanced statistical background.
Longitudinal heart failure cost data from Ontario, Canada, were used, with costs estimated from administrative databases. The dataset consisted of 43,888 patients, with follow-up periods ranging from 1 to 1538 days (mean 576 days). Mean health care costs over 1080 days of follow-up were calculated using naïve estimators such as full-sample and uncensored case estimators. Reweighted estimators - specifically, the inverse probability weighted estimator - were calculated, as was phase-based costing. Costs were adjusted to 2008 Canadian dollars using the Bank of Canada consumer price index (
Over the restricted follow-up of 1080 days, 32% of patients were censored. The full-sample estimator was found to underestimate mean cost ($30,420) compared with the reweighted estimators ($36,490). The phase-based costing estimate of $37,237 was similar to that of the simple reweighted estimator.
The authors recommend against the use of full-sample or uncensored case estimators when censored data are present. In the presence of heavy censoring, phase-based costing is an attractive alternative approach.
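To make the contrast between the naïve and reweighted estimators concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one common form of the inverse probability weighted estimator, in which each patient with a complete cost history is weighted by the inverse of the Kaplan-Meier estimate of remaining uncensored up to that patient's observed time. The `Patient` record, the function names, and the four-patient toy dataset are all hypothetical and chosen only for illustration; tie handling between death and censoring times is deliberately simplified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    """Hypothetical record layout, assumed for illustration only."""
    follow_up: float  # days under observation
    died: bool        # True if follow-up ended in death, False if censored
    cost: float       # cost accrued up to min(follow_up, horizon)


def observed_time(p: Patient, horizon: float) -> float:
    # Follow-up restricted to the analysis horizon.
    return min(p.follow_up, horizon)


def is_complete(p: Patient, horizon: float) -> bool:
    # Costs are fully observed if the patient died or reached the horizon.
    return p.died or p.follow_up >= horizon


def censoring_survival_before(t: float, patients: List[Patient],
                              horizon: float) -> float:
    """Kaplan-Meier estimate of P(censoring time >= t), treating
    censoring as the event. Simplification: ties between death and
    censoring times are not given special treatment."""
    times = [observed_time(p, horizon) for p in patients]
    censored = [observed_time(p, horizon) for p in patients
                if not is_complete(p, horizon)]
    k = 1.0
    for c in sorted(set(censored)):
        if c >= t:
            break
        at_risk = sum(1 for x in times if x >= c)
        k *= 1.0 - censored.count(c) / at_risk
    return k


def full_sample_mean(patients: List[Patient]) -> float:
    # Naive: averages observed costs, ignoring that some are incomplete.
    return sum(p.cost for p in patients) / len(patients)


def uncensored_case_mean(patients: List[Patient], horizon: float) -> float:
    # Naive: averages costs over complete cases only.
    complete = [p.cost for p in patients if is_complete(p, horizon)]
    return sum(complete) / len(complete)


def ipw_mean(patients: List[Patient], horizon: float) -> float:
    # Reweighted: complete cases are up-weighted by 1 / K(T-), and the
    # weighted sum is divided by the full sample size.
    total = 0.0
    for p in patients:
        if is_complete(p, horizon):
            t = observed_time(p, horizon)
            total += p.cost / censoring_survival_before(t, patients, horizon)
    return total / len(patients)


# Toy data (hypothetical): horizon of 3 days, one censored patient.
toy_patients = [
    Patient(follow_up=1, died=True, cost=100.0),
    Patient(follow_up=2, died=False, cost=200.0),  # censored before horizon
    Patient(follow_up=3, died=False, cost=300.0),  # reached horizon
    Patient(follow_up=5, died=False, cost=400.0),  # cost truncated at horizon
]
print(f"full-sample:     {full_sample_mean(toy_patients):.2f}")
print(f"uncensored-case: {uncensored_case_mean(toy_patients, 3):.2f}")
print(f"IPW:             {ipw_mean(toy_patients, 3):.2f}")
```

On this toy dataset the full-sample mean is 250.00, the uncensored-case mean is 266.67, and the reweighted mean is 287.50. The direction of the gap between the full-sample and reweighted figures matches the abstract's finding, but the numbers themselves carry no meaning beyond illustration.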

  • ABSTRACT: Incompleteness is a key feature of most survival data. Numerous well established statistical methodologies and algorithms exist for analyzing life or failure time data. However, induced censorship invalidates the use of those standard analytic tools for some survival-type data such as medical costs. In this paper, some valid methods currently available for analyzing censored medical cost data are reviewed. Some cautionary findings under different assumptions are envisioned through application to medical costs from colorectal cancer patients. Cost analysis should be suitably planned and carefully interpreted under various meaningful scenarios even with judiciously selected statistical methods. This approach would be greatly helpful to policy makers who seek to prioritize health care expenditures and to assess the elements of resource use.
    Contemporary Clinical Trials 11/2005; 26(5):586-97. · 1.99 Impact Factor
  • ABSTRACT: Medical expenditure data typically exhibit certain characteristics that must be accounted for when deriving cost estimates. First, it is common for a small percentage of patients to incur extremely high costs compared to other patients, resulting in a distribution of expenses that is highly skewed to the right. Second, the assumption of homoscedasticity (constant variance) is often violated because expense data exhibit variability that increases as the mean expense increases. In this paper, we describe the use of the generalized linear model for estimating costs, and discuss several advantages that this technique has over traditional methods of cost analysis. We provide an example, applying this technique to the problem of determining an incidence-based estimate of the cost of care for patients with diabetes who suffer a stroke.
    Health Services and Outcomes Research Methodology 05/2000; 1(2):185-202.
  • ABSTRACT: We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of above data characteristics, may be preferable but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work.
    Health Economics 08/2011; 20(8):897-916. · 2.14 Impact Factor
