ABSTRACT: Missing observations are common in cluster randomised trials. Approaches
taken to handling such missing data include: complete case analysis,
single-level multiple imputation that ignores the clustering, multiple
imputation with a fixed effect for each cluster and multilevel multiple imputation.
We conducted a simulation study to assess the performance of these
approaches, in terms of confidence interval coverage and empirical bias in the
estimated treatment effects. Missing-at-random clustered data scenarios were
simulated following a full-factorial design. An Analysis of Variance was
carried out to study the influence of the simulation factors on each
When the randomised treatment arm was associated with missingness, complete
case analysis resulted in biased treatment effect estimates. Across all the
missing data mechanisms considered, the multiple imputation methods provided
estimators with negligible bias. Confidence interval coverage was generally in
excess of nominal levels (up to 99.8%) following fixed-effects multiple
imputation, and too low following single-level multiple imputation. Multilevel
multiple imputation led to coverage levels of approximately 95% throughout.
The approach to handling missing data was the most influential factor on the
bias and coverage. Within each method, the most important factors were the
number and size of clusters, and the intraclass correlation coefficient.
ABSTRACT: Early research in adults admitted to intensive care suggested that tight control of blood glucose during acute illness can be associated with reductions in mortality, length of hospital stay and complications such as infection and renal failure. Prior to our study, it was unclear whether or not children could also benefit from tight control of blood glucose during critical illness.
This study aimed to determine if controlling blood glucose using insulin in paediatric intensive care units (PICUs) reduces mortality and morbidity and is cost-effective, whether or not admission follows cardiac surgery.
Randomised open two-arm parallel group superiority design with central randomisation with minimisation. Analysis was on an intention-to-treat basis. Following random allocation, care givers and outcome assessors were no longer blind to allocation.
The setting was 13 English PICUs.
Patients who met the following criteria were eligible for inclusion: ≥ 36 weeks corrected gestational age; ≤ 16 years; in the PICU following injury, following major surgery or with critical illness; anticipated treatment > 12 hours; arterial line; mechanical ventilation; and vasoactive drugs. Exclusion criteria were as follows: diabetes mellitus; inborn error of metabolism; treatment withdrawal considered; in the PICU > 5 consecutive days; and already in CHiP (Control of Hyperglycaemia in Paediatric intensive care).
The intervention was tight glycaemic control (TGC): insulin by intravenous infusion titrated to maintain blood glucose between 4.0 and 7.0 mmol/l.
Conventional management (CM) consisted of insulin by intravenous infusion only if blood glucose exceeded 12.0 mmol/l on two samples at least 30 minutes apart; insulin was stopped when blood glucose fell below 10.0 mmol/l.
The primary outcome was the number of days alive and free from mechanical ventilation within 30 days of trial entry (VFD-30). The secondary outcomes comprised clinical and economic outcomes at 30 days and 12 months and lifetime cost-effectiveness, which included costs per quality-adjusted life-year.
CHiP recruited from May 2008 to September 2011. In total, 19,924 children were screened and 1369 eligible patients were randomised (TGC, 694; CM, 675), 60% of whom were in the cardiac surgery stratum. The randomised groups were comparable at trial entry. More children in the TGC than in the CM arm received insulin (66% vs. 16%). The mean VFD-30 was 23 [mean difference 0.36; 95% confidence interval (CI) -0.42 to 1.14]. The effect did not differ among prespecified subgroups. Hypoglycaemia occurred significantly more often in the TGC than in the CM arm (moderate, 12.5% vs. 3.1%; severe, 7.3% vs. 1.5%). Mean 30-day costs were similar between arms, but mean 12-month costs were lower in the TGC than in CM arm (incremental costs -£3620, 95% CI -£7743 to £502). For the non-cardiac surgery stratum, mean costs were lower in the TGC than in the CM arm (incremental cost -£9865, 95% CI -£18,558 to -£1172), but, in the cardiac surgery stratum, the costs were similar between the arms (incremental cost £133, 95% CI -£3568 to £3833). Lifetime incremental net benefits were positive overall (£3346, 95% CI -£11,203 to £17,894), but close to zero for the cardiac surgery stratum (-£919, 95% CI -£16,661 to £14,823). For the non-cardiac surgery stratum, the incremental net benefits were high (£11,322, 95% CI -£15,791 to £38,615). The probability that TGC is cost-effective is relatively high for the non-cardiac surgery stratum, but, for the cardiac surgery subgroup, the probability that TGC is cost-effective is around 0.5. Sensitivity analyses showed that the results were robust to a range of alternative assumptions.
CHiP found no differences in the clinical or cost-effectiveness of TGC compared with CM overall, or for prespecified subgroups. A higher proportion of the TGC arm had hypoglycaemia. This study did not provide any evidence to suggest that PICUs should stop providing CM for children admitted to PICUs following cardiac surgery. For the subgroup not admitted for cardiac surgery, TGC reduced average costs at 12 months and is likely to be cost-effective. Further research is required to refine the TGC protocol to minimise the risk of hypoglycaemic episodes and assess the long-term health benefits of TGC.
Current Controlled Trials ISRCTN61735247.
This project was funded by the NIHR Health Technology Assessment programme and will be published in full in Health Technology Assessment; Vol. 18, No. 26. See the NIHR Journals Library website for further project information.
Health technology assessment (Winchester, England). 04/2014; 18(26):1-210.
ABSTRACT: Statistical approaches for estimating treatment effectiveness commonly model the endpoint, or the propensity score, using parametric regressions such as generalised linear models. Misspecification of these models can lead to biased parameter estimates. We compare two approaches that combine the propensity score and the endpoint regression, and can make weaker modelling assumptions, by using machine learning approaches to estimate the regression function and the propensity score. Targeted maximum likelihood estimation is a double-robust method designed to reduce bias in the estimate of the parameter of interest. Bias-corrected matching reduces bias due to covariate imbalance between matched pairs by using regression predictions. We illustrate the methods in an evaluation of different types of hip prosthesis on the health-related quality of life of patients with osteoarthritis. We undertake a simulation study, grounded in the case study, to compare the relative bias, efficiency and confidence interval coverage of the methods. We consider data generating processes with non-linear functional form relationships, normal and non-normal endpoints. We find that across the circumstances considered, bias-corrected matching generally reported less bias, but higher variance than targeted maximum likelihood estimation. When either targeted maximum likelihood estimation or bias-corrected matching incorporated machine learning, bias was much reduced, compared to using misspecified parametric models.
Statistical Methods in Medical Research 02/2014; · 2.36 Impact Factor
ABSTRACT: Whether an insulin infusion should be used for tight control of hyperglycemia in critically ill children remains unclear.
We randomly assigned children (≤16 years of age) who were admitted to the pediatric intensive care unit (ICU) and were expected to require mechanical ventilation and vasoactive drugs for at least 12 hours to either tight glycemic control, with a target blood glucose range of 72 to 126 mg per deciliter (4.0 to 7.0 mmol per liter), or conventional glycemic control, with a target level below 216 mg per deciliter (12.0 mmol per liter). The primary outcome was the number of days alive and free from mechanical ventilation at 30 days after randomization. The main prespecified subgroup analysis compared children who had undergone cardiac surgery with those who had not. We also assessed costs of hospital and community health services.
A total of 1369 patients at 13 centers in England underwent randomization: 694 to tight glycemic control and 675 to conventional glycemic control; 60% had undergone cardiac surgery. The mean between-group difference in the number of days alive and free from mechanical ventilation at 30 days was 0.36 days (95% confidence interval [CI], -0.42 to 1.14); the effects did not differ according to subgroup. Severe hypoglycemia (blood glucose, <36 mg per deciliter [2.0 mmol per liter]) occurred in a higher proportion of children in the tight-glycemic-control group than in the conventional-glycemic-control group (7.3% vs. 1.5%, P<0.001). Overall, the mean 12-month costs were lower in the tight-glycemic-control group than in the conventional-glycemic-control group. The mean 12-month costs were similar in the two groups in the cardiac-surgery subgroup, but in the subgroup that had not undergone cardiac surgery, the mean cost was significantly lower in the tight-glycemic-control group than in the conventional-glycemic-control group: -$13,120 (95% CI, -$24,682 to -$1,559).
This multicenter, randomized trial showed that tight glycemic control in critically ill children had no significant effect on major clinical outcomes, although the incidence of hypoglycemia was higher with tight glucose control than with conventional glucose control. (Funded by the National Institute for Health Research, Health Technology Assessment Program, U.K. National Health Service; CHiP Current Controlled Trials number, ISRCTN61735247.).
New England Journal of Medicine 01/2014; 370:107-118. · 51.66 Impact Factor
ABSTRACT: Multilevel models provide a flexible modelling framework for cost-effectiveness analyses that use cluster randomised trial data. However, there is a lack of guidance on how to choose the most appropriate multilevel models. This paper illustrates an approach for deciding what level of model complexity is warranted; in particular how best to accommodate complex variance-covariance structures, right-skewed costs and missing data. Our proposed models differ according to whether or not they allow individual-level variances and correlations to differ across treatment arms or clusters and by the assumed cost distribution (Normal, Gamma, Inverse Gaussian). The models are fitted by Markov chain Monte Carlo methods. Our approach to model choice is based on four main criteria: the characteristics of the data, model pre-specification informed by the previous literature, diagnostic plots and assessment of model appropriateness. This is illustrated by re-analysing a previous cost-effectiveness analysis that uses data from a cluster randomised trial. We find that the most useful criterion for model choice was the deviance information criterion, which distinguishes amongst models with alternative variance-covariance structures, as well as between those with different cost distributions. This strategy for model choice can help cost-effectiveness analyses provide reliable inferences for policy-making when using cluster trials, including those with missing data.
Statistical Methods in Medical Research 12/2013; · 2.36 Impact Factor
ABSTRACT: Multiple imputation (MI) has been proposed for handling missing data in cost-effectiveness analyses (CEAs). In CEAs that use cluster randomized trials (CRTs), the imputation model, like the analysis model, should recognize the hierarchical structure of the data. This paper contrasts a multilevel MI approach that recognizes clustering, with single-level MI and complete case analysis (CCA) in CEAs that use CRTs.
We consider a multilevel MI approach compatible with multilevel analytical models for CEAs that use CRTs. We took fully observed data from a CEA that evaluated an intervention to improve diagnosis of active labor in primiparous women using a CRT (2078 patients, 14 clusters). We generated scenarios with missing costs and outcomes that differed, for example, according to the proportion with missing data (10%-50%), the covariates that predicted missing data (individual, cluster-level), and the missingness mechanism: missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). We estimated incremental net benefits (INBs) for each approach and compared them with the estimates from the fully observed data, the "true" INBs.
When costs and outcomes were assumed to be MCAR, the INBs for each approach were similar to the true estimates. When data were MAR, the point estimates from the CCA differed from the true estimates. Multilevel MI provided point estimates and standard errors closer to the true values than did single-level MI across all settings, including those in which a high proportion of observations had cost and outcome data MAR and when data were MNAR.
Multilevel MI accommodates the multilevel structure of the data in CEAs that use cluster trials and provides accurate cost-effectiveness estimates across the range of circumstances considered.
Medical Decision Making 08/2013; · 2.89 Impact Factor
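The incremental net benefit (INB) reported in several of these abstracts is conventionally defined as the willingness-to-pay threshold multiplied by the incremental health gain, minus the incremental cost. A minimal sketch of that arithmetic follows; the function name and the input values are hypothetical illustrations and are not taken from any of the studies listed here.

```python
def incremental_net_benefit(delta_cost, delta_qaly, threshold):
    """Return the INB: threshold * incremental QALYs - incremental cost.

    delta_cost : mean cost difference (intervention minus comparator)
    delta_qaly : mean QALY difference (intervention minus comparator)
    threshold  : willingness to pay per QALY gained
    """
    return threshold * delta_qaly - delta_cost


# Hypothetical inputs, chosen only to illustrate the calculation.
inb = incremental_net_benefit(delta_cost=2000.0, delta_qaly=0.25, threshold=20000.0)
print(inb)  # a positive INB favours the intervention at this threshold
```

A positive INB indicates that, at the chosen threshold, the value of the health gain exceeds the additional cost; the abstracts report uncertainty around the INB with confidence intervals, which this sketch does not attempt.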
ABSTRACT: DESIGN: Cohort study. SETTING: Sixty-seven adult critical care units. PARTICIPANTS: Adult patients admitted to critical care following actual/suspected TBI with a Glasgow Coma Scale (GCS) score of < 15. INTERVENTIONS: Critical care delivered in a dedicated neurocritical care unit, a combined neuro/general critical care unit within a neuroscience centre or a general critical care unit outside a neuroscience centre. MAIN OUTCOME MEASURES: Mortality, Glasgow Outcome Scale - Extended (GOSE) questionnaire and European Quality of Life-5 Dimensions, 3-level version (EQ-5D-3L) questionnaire at 6 months following TBI. RESULTS: The final Risk Adjustment In Neurocritical care (RAIN) study data set contained 3626 admissions. After exclusions, 3210 patients with acute TBI were included. Overall follow-up rate at 6 months was 81%. Of 3210 patients, 101 (3.1%) had no GCS score recorded and 134 (4.2%) had a last pre-sedation GCS score of 15, resulting in 2975 patients for analysis. The most common causes of TBI were road traffic accidents (RTAs) (33%), falls (47%) and assault (12%). Patients were predominantly young (mean age 45 years overall) and male (76% overall). Six-month mortality was 22% for RTAs, 32% for falls and 17% for assault. Of survivors at 6 months with a known GOSE category, 44% had severe disability, 30% moderate disability and 26% made a good recovery. Overall, 61% of patients with known outcome had an unfavourable outcome (death or severe disability) at 6 months. Between 35% and 70% of survivors reported problems across the five domains of the EQ-5D-3L. Of the 10 risk models selected for validation, the best discrimination overall was from the International Mission for Prognosis and Analysis of Clinical Trials in TBI Lab model (IMPACT) (c-index 0.779 for mortality, 0.713 for unfavourable outcome).
The model was well calibrated for 6-month mortality but substantially underpredicted the risk of unfavourable outcome at 6 months. Baseline patient characteristics were similar between dedicated neurocritical care units and combined neuro/general critical care units. In lifetime cost-effectiveness analysis, dedicated neurocritical care units had higher mean lifetime quality-adjusted life-years (QALYs) at small additional mean costs with an incremental cost-effectiveness ratio (ICER) of £14,000 per QALY and incremental net monetary benefit (INB) of £17,000. The cost-effectiveness acceptability curve suggested that the probability that dedicated compared with combined neurocritical care units are cost-effective is around 60%. There were substantial differences in case mix between the 'early' (within 18 hours of presentation) and 'no or late' (after 24 hours) transfer groups. After adjustment, the 'early' transfer group reported higher lifetime QALYs at an additional cost with an ICER of £11,000 and INB of £17,000. CONCLUSIONS: The risk models demonstrated sufficient statistical performance to support their use in research but fell below the level required to guide individual patient decision-making. The results suggest that management in a dedicated neurocritical care unit may be cost-effective compared with a combined neuro/general critical care unit (although there is considerable statistical uncertainty) and support current recommendations that all patients with severe TBI would benefit from transfer to a neurosciences centre, regardless of the need for surgery. We recommend further research to improve risk prediction models; consider alternative approaches for handling unobserved confounding; better understand long-term outcomes and alternative pathways of care; and explore equity of access to postcritical care support for patients following acute TBI. FUNDING: The National Institute for Health Research Health Technology Assessment programme.
Health technology assessment (Winchester, England). 06/2013; 17(23):1-350.
ABSTRACT: BACKGROUND: There is increasing evidence that invasive fungal disease (IFD) is more likely to occur in non-neutropenic patients in critical care units. A number of randomised controlled trials (RCTs) have evaluated antifungal prophylaxis in non-neutropenic, critically ill patients, demonstrating a reduction in the risk of proven IFD and suggesting a reduction in mortality. It is necessary to establish a method to identify and target antifungal prophylaxis at those patients at highest risk of IFD, who stand to benefit most from any antifungal prophylaxis strategy. OBJECTIVES: To develop and validate risk models to identify non-neutropenic, critically ill adult patients at high risk of invasive Candida infection, who would benefit from antifungal prophylaxis, and to assess the cost-effectiveness of targeting antifungal prophylaxis to high-risk patients based on these models. DESIGN: Systematic review, prospective data collection, statistical modelling, economic decision modelling and value of information analysis. SETTING: Ninety-six UK adult general critical care units. PARTICIPANTS: Consecutive admissions to participating critical care units. INTERVENTIONS: None. MAIN OUTCOME MEASURES: Invasive fungal disease, defined as a blood culture or sample from a normally sterile site showing yeast/mould cells in a microbiological or histopathological report. For statistical and economic modelling, the primary outcome was invasive Candida infection, defined as IFD-positive for Candida species. RESULTS: Systematic review: Thirteen articles exploring risk factors, risk models or clinical decision rules for IFD in critically ill adult patients were identified. Risk factors reported to be significantly associated with IFD were included in the final data set for the prospective data collection. Data collection: Data were collected on 60,778 admissions between July 2009 and March 2011.
Overall, 383 patients (0.6%) were admitted with or developed IFD. The majority of IFD patients (94%) were positive for Candida species. The most common site of infection was blood (55%). The incidence of IFD identified in unit was 4.7 cases per 1000 admissions, and for unit-acquired IFD was 3.2 cases per 1000 admissions. Statistical modelling: Risk models were developed at admission to the critical care unit, 24 hours and the end of calendar day 3. The risk model at admission had fair discrimination (c-index 0.705). Discrimination improved at 24 hours (c-index 0.823) and this was maintained at the end of calendar day 3 (c-index 0.835). There was a drop in model performance in the validation sample. Economic decision model: Irrespective of risk threshold, incremental quality-adjusted life-years of prophylaxis strategies compared with current practice were positive but small compared with the incremental costs. Incremental net benefits of each prophylaxis strategy compared with current practice were all negative. Cost-effectiveness acceptability curves showed that current practice was the strategy most likely to be cost-effective. Across all parameters in the decision model, results indicated that the value of further research for the whole population of interest might be high relative to the research costs. CONCLUSIONS: The results of the Fungal Infection Risk Evaluation (FIRE) Study, derived from a highly representative sample of adult general critical care units across the UK, indicated a low incidence of IFD among non-neutropenic, critically ill adult patients. IFD was associated with substantially higher mortality, more intensive organ support and longer length of stay. Risk modelling produced simple risk models that provided acceptable discrimination for identifying patients at 'high risk' of invasive Candida infection. 
Results of the economic model suggested that the current most cost-effective treatment strategy for prophylactic use of systemic antifungal agents among non-neutropenic, critically ill adult patients admitted to NHS adult general critical care units is a strategy of no risk assessment and no antifungal prophylaxis. FUNDING: Funding for this study was provided by the Health Technology Assessment programme of the National Institute for Health Research.
Health technology assessment (Winchester, England). 02/2013; 17(3):1-156.
ABSTRACT: To compare the cost effectiveness of the three most commonly chosen types of prosthesis for total hip replacement.
Lifetime cost effectiveness model with parameters estimated from individual patient data obtained from three large national databases.
English National Health Service.
Adults aged 55 to 84 undergoing primary total hip replacement for osteoarthritis.
Total hip replacement using either cemented, cementless, or hybrid prostheses.
Cost (£), quality of life (EQ-5D-3L, where 0 represents death and 1 perfect health), quality adjusted life years (QALYs), incremental cost effectiveness ratios, and the probability that each prosthesis type is the most cost effective at alternative thresholds of willingness to pay for a QALY gain.
Lifetime costs were generally lowest with cemented prostheses, and postoperative quality of life and lifetime QALYs were highest with hybrid prostheses. For example, in women aged 70 mean costs were £6900 ($11 000; €8200) for cemented prostheses, £7800 for cementless prostheses, and £7500 for hybrid prostheses; mean postoperative EQ-5D scores were 0.78, 0.80, and 0.81, and the corresponding lifetime QALYs were 9.0, 9.2, and 9.3 years. The incremental cost per QALY for hybrid compared with cemented prostheses was £2500. If the threshold willingness to pay for a QALY gain exceeded £10 000, the probability that hybrid prostheses were most cost effective was about 70%. Hybrid prostheses have the highest probability of being the most cost effective in all subgroups, except in women aged 80, where cemented prostheses were most cost effective.
Cemented prostheses were the least costly type for total hip replacement, but for most patient groups hybrid prostheses were the most cost effective. Cementless prostheses did not provide sufficient improvement in health outcomes to justify their additional costs.
ABSTRACT: The number of prosthesis brands used for hip replacement has increased rapidly, but there is little evidence on their effectiveness. We compared patient-reported outcomes, revision rates, and mortality for the three most frequently used brands within each prosthesis type: cemented (Exeter V40 Contemporary, Exeter V40 Duration and Exeter V40 Elite Plus Ogee), cementless (Corail Pinnacle, Accolade Trident, and Taperloc Exceed), and hybrid (Exeter V40 Trilogy, Exeter V40 Trident, and CPT Trilogy).
We used three national databases of patients who had hip replacements between 2008 and 2011 in the English NHS to compare functional outcome (Oxford Hip Score (OHS) ranging from 0 (worst) to 48 (best)) in 43,524 patients at six months. We analysed revisions and mortality in 187,201 patients. We used multiple regression to adjust for pre-operative differences. Prosthesis type had an impact on post-operative OHS and revision rates (both p<0.001). Patients with hybrid prostheses had the best functional outcome (mean OHS 39.4, 95%CI 39.1 to 39.7) and those with cemented prostheses the worst (37.7, 37.3 to 38.1). Patients with cemented prostheses had the lowest reported 5-year revision rates (1.3%, 1.2% to 1.4%) and those with cementless prostheses the highest (2.2%, 2.1% to 2.4%). Differences in mortality according to prosthesis type were small and not significant (p = 0.06). Functional outcome varied according to brand among cemented (p = 0.05, with Exeter V40 Duration having the best) and cementless prostheses (p = 0.01, with Corail Pinnacle having the best). Revision rates varied according to brand among hybrids (p = 0.05, with Exeter V40 Trident having the lowest).
Functional outcomes were better with cementless cups and revision rates were lower with cemented stems, which underlies the good overall performance of hybrids. The hybrid Exeter V40 Trident seemed to produce the best overall results. This brand should be considered as a benchmark in randomised trials.
PLoS ONE 01/2013; 8(9):e73228. · 3.53 Impact Factor
ABSTRACT: Regression, propensity score (PS) and double-robust (DR) methods can reduce selection bias when estimating average treatment effects (ATEs). Economic evaluations of health care interventions exemplify complex data structures, in that the covariate–endpoint relationships tend to be highly non-linear, with highly skewed cost and health outcome endpoints. When either the regression or PS model is correct, DR methods can provide unbiased, efficient estimates of ATEs, but generally the specification of both models is unknown. Regression-adjusted matching can also protect against bias from model misspecification, but has not been compared to DR methods. This paper compares regression-adjusted matching to selected DR methods (weighted regression and augmented inverse probability of treatment weighting) as well as to regression and PS methods for addressing selection bias in cost-effectiveness analyses (CEA). We contrast the methods in a CEA of a pharmaceutical intervention, where there are extreme estimated PSs, hence unstable inverse probability of treatment (IPT) weights. The case study motivates a simulation which considers settings with functional form misspecification in the PS and endpoint regression models (e.g. cost model with log instead of identity link), stable and unstable PS weights. We find that in the realistic setting of unstable IPT weights and misspecifications to the PS and regression models, regression-adjusted matching reports less bias than DR methods. We conclude that regression-adjusted matching is a relatively robust method for estimating ATEs in applications with complex data structures exemplified by CEA.
Health Services and Outcomes Research Methodology 01/2013;
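To illustrate why extreme estimated propensity scores lead to unstable inverse probability of treatment weights, the sketch below computes the standard IPT weights (1/e for treated units, 1/(1-e) for controls) and a normalised weighted difference in mean outcomes. It is a generic textbook-style illustration under stated assumptions, not the estimator used in the paper above; the function names and all data values are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Standard IPT weights: 1/e if treated, 1/(1 - e) if control."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t == 1 else 1.0 / (1.0 - e))
    return weights


def weighted_mean_difference(outcomes, treated, weights):
    """Normalised (Hajek-type) weighted difference in mean outcomes."""
    sum_wy_t = sum(w * y for y, t, w in zip(outcomes, treated, weights) if t == 1)
    sum_w_t = sum(w for t, w in zip(treated, weights) if t == 1)
    sum_wy_c = sum(w * y for y, t, w in zip(outcomes, treated, weights) if t == 0)
    sum_w_c = sum(w for t, w in zip(treated, weights) if t == 0)
    return sum_wy_t / sum_w_t - sum_wy_c / sum_w_c


# Hypothetical data: balanced propensity scores give equal, stable weights.
treated = [1, 0, 1, 0]
outcomes = [3.0, 1.0, 5.0, 2.0]
stable = iptw_weights(treated, [0.5, 0.5, 0.5, 0.5])
print(weighted_mean_difference(outcomes, treated, stable))

# A treated unit with a propensity score near zero receives a very large weight,
# so a single observation can dominate the weighted estimate.
unstable = iptw_weights(treated, [0.01, 0.5, 0.5, 0.5])
print(max(unstable))
```

In the second call the first treated unit carries a weight roughly fifty times larger than the others, which is the kind of instability the abstract refers to.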
ABSTRACT: This article proposes an integrated approach to the development, validation, and evaluation of new risk prediction models illustrated with the Fungal Infection Risk Evaluation study, which developed risk models to identify non-neutropenic, critically ill adult patients at high risk of invasive fungal disease (IFD).
Our decision-analytical model compared alternative strategies for preventing IFD at up to three clinical decision time points (critical care admission, after 24 hours, and end of day 3), followed with antifungal prophylaxis for those judged “high” risk versus “no formal risk assessment.” We developed prognostic models to predict the risk of IFD before critical care unit discharge, with data from 35,455 admissions to 70 UK adult, critical care units, and validated the models externally. The decision model was populated with positive predictive values and negative predictive values from the best-fitting risk models. We projected lifetime cost-effectiveness and expected value of partial perfect information for groups of parameters.
The risk prediction models performed well in internal and external validation. Risk assessment and prophylaxis at the end of day 3 was the most cost-effective strategy at the 2% and 1% risk threshold. Risk assessment at each time point was the most cost-effective strategy at a 0.5% risk threshold. Expected values of partial perfect information were high for positive predictive values or negative predictive values (£11 million–£13 million) and quality-adjusted life-years (£11 million).
It is cost-effective to formally assess the risk of IFD for non-neutropenic, critically ill adult patients. This integrated approach to developing and evaluating risk models is useful for informing clinical practice and future research investment.
ABSTRACT: Public policy-makers use cost-effectiveness analyses (CEA) to decide which
health and social care interventions to provide. Appropriate methods have not
been developed for handling missing data in complex settings, such as for CEA
that use data from cluster randomised trials (CRTs). We present a multilevel
multiple imputation (MI) approach that recognises when missing data have a
hierarchical structure, and is compatible with the bivariate multilevel models
used to report cost-effectiveness. We contrast the multilevel MI approach with
single-level MI and complete case analysis in a CEA alongside a CRT. The paper
highlights the importance of adopting a principled approach to handling missing
values in settings with complex data structures.
Journal of the Royal Statistical Society Series A (Statistics in Society) 06/2012; · 1.36 Impact Factor
ABSTRACT: Decision makers require cost-effectiveness estimates for patient subgroups. In nonrandomized studies, propensity score (PS) matching and inverse probability of treatment weighting (IPTW) can address overt selection bias, but only if they balance observed covariates between treatment groups. Genetic matching (GM) matches on the PS and individual covariates using an automated search algorithm to directly balance baseline covariates. This article compares these methods for estimating subgroup effects in cost-effectiveness analyses (CEA). The motivating case study is a CEA of a pharmaceutical intervention, drotrecogin alfa (DrotAA), for patient subgroups with severe sepsis (n = 2726). Here, GM reported better covariate balance than PS matching and IPTW. For the subgroup at a high level of baseline risk, the probability that DrotAA was cost-effective ranged from 30% (IPTW) to 90% (PS matching and GM), at a threshold of £20 000 per quality-adjusted life-year. We then compared the methods in a simulation study, in which initially the PS was correctly specified and then misspecified, for example, by ignoring the subgroup-specific treatment assignment. Relative performance was assessed as bias and root mean squared error (RMSE) in the estimated incremental net benefits. When the PS was correctly specified and inverse probability weights were stable, each method performed well; IPTW reported the lowest RMSE. When the subgroup-specific treatment assignment was ignored, PS matching and IPTW reported covariate imbalance and bias; GM reported better balance, less bias, and more precise estimates. We conclude that if the PS is correctly specified and the weights for IPTW are stable, each method can provide unbiased cost-effectiveness estimates. However, unlike IPTW and PS matching, GM is relatively robust to PS misspecification.
Medical Decision Making 06/2012; · 2.89 Impact Factor
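The abstract above reports results as incremental net benefits and as the probability that a treatment is cost-effective at a threshold of £20 000 per quality-adjusted life-year. A minimal sketch of those two quantities follows, assuming the standard definition of incremental net benefit (threshold times incremental QALYs, minus incremental costs). The function names and all input values are hypothetical illustrations and are not taken from the study.

```python
def incremental_net_benefit(delta_qalys, delta_costs, threshold=20_000.0):
    """Incremental net benefit in monetary units.

    delta_qalys: incremental QALYs (treatment minus comparator)
    delta_costs: incremental costs (treatment minus comparator)
    threshold:   willingness to pay per QALY (assumed default: 20 000)
    """
    return threshold * delta_qalys - delta_costs


def probability_cost_effective(inb_draws):
    """Share of simulated or bootstrapped INB draws that are positive."""
    draws = list(inb_draws)
    if not draws:
        raise ValueError("at least one INB draw is required")
    return sum(1 for d in draws if d > 0) / len(draws)


# Hypothetical inputs: 0.5 extra QALYs at an extra cost of 8 000.
example_inb = incremental_net_benefit(delta_qalys=0.5, delta_costs=8_000.0)

# Hypothetical INB draws, e.g. from a bootstrap of patient-level data.
example_probability = probability_cost_effective([100.0, -50.0, 25.0, -1.0])
```

A positive incremental net benefit means the treatment is cost-effective at the chosen threshold; the probability is simply the fraction of draws for which that holds.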
ABSTRACT: Statistical methods have been developed for cost-effectiveness analyses of cluster randomised trials (CRTs) where baseline covariates are balanced. However, CRTs may show systematic differences in individual and cluster-level covariates between the treatment groups. This paper presents three methods to adjust for imbalances in observed covariates: seemingly unrelated regression with a robust standard error, a 'two-stage' bootstrap approach combined with seemingly unrelated regression, and multilevel models. We consider the methods in a cost-effectiveness analysis of a CRT with covariate imbalance, unequal cluster sizes and a prognostic relationship that varied by treatment group. The cost-effectiveness results differed according to the approach for covariate adjustment. A simulation study then assessed the relative performance of methods for addressing systematic imbalance in baseline covariates. The simulations extended the case study and considered scenarios with different levels of confounding, cluster size variation and few clusters. Performance was reported as bias, root mean squared error and CI coverage of the incremental net benefit. Even with low levels of confounding, unadjusted methods were biased, but all adjusted methods were unbiased. Multilevel models performed well across all settings, and unlike the other methods, reported CI coverage close to nominal levels even with few clusters of unequal sizes.
Health Economics 03/2012; 21(9):1101-18. · 2.23 Impact Factor
ABSTRACT: The verteporfin photodynamic therapy (VPDT) cohort study aimed to answer five questions: (a) is VPDT in the NHS provided as in randomised trials?; (b) is 'outcome' the same in the NHS as in randomised trials?; (c) is 'outcome' the same for patients ineligible for randomised trials?; (d) is VPDT safe when provided in the NHS?; and (e) how effective and cost-effective is VPDT?
All hospitals providing VPDT in the NHS.
All patients attending VPDT clinics.
Infusion of verteporfin followed by infrared laser exposure is called VPDT, and is used to treat neovascular age-related macular degeneration (nAMD). The VPDT cohort study advised clinicians to follow patients every 3 months during treatment or active observation, retreating based on criteria used in the previous commercial 'TAP' (Treatment of Age-related macular degeneration with Photodynamic therapy) trials of VPDT.
The primary outcome was logarithm of the minimum angle of resolution monocular best-corrected distance visual acuity (BCVA). Secondary outcomes were adverse reactions and events; morphological changes in treated nAMD (wet) lesions; and for a subset of patients, 6-monthly contrast sensitivity, generic and visual health-related quality of life (HRQoL) and resource use. Treated eyes were classified as eligible for the TAP trials (EFT), ineligible (IFT) or unclassifiable (UNC).
Forty-seven hospitals submitted data for 8323 treated eyes in 7748 patients; 4919 eyes in 4566 patients were treated more than 1 year before the last data submission or had completed treatment. Of 4043 eyes with nAMD in 4043 patients, 1227 were classified as EFT, 1187 as IFT and 1629 as UNC. HRQoL and resource use data were available for about 2000 patients. The mean number of treatments in years 1 and 2 was 2.3 and 0.4 respectively. About 50% of eyes completed treatment within 1 year. BCVA deterioration in year 1 did not differ between eligibility groups. EFT eyes lost 11.6 letters (95% confidence interval 10.1 to 13.0 letters) compared with 9.9 letters in VPDT-treated eyes in the TAP trials. EFT eyes had poorer BCVA at baseline than IFT and UNC eyes. Adverse reactions and events were reported for 1.4% of first visits - less frequently than those reported in the TAP trials. Associations between BCVA in the best-seeing eye with HRQoL and community health and social care resource use showed that the 11-letter difference in BCVA between VPDT and sham treatment in the TAP trials corresponded to differences in utility of 0.012 and health and social service costs of £60 and £92 in years 1 and 2, respectively. VPDT provided an incremental cost per quality-adjusted life-year (QALY) of £170,000 over 2 years.
VPDT was administered less frequently than in the TAP trials, with fewer than half of those treated followed up for more than 1 year in routine clinical practice. Deterioration in BCVA over time in EFT eyes was similar to that in the TAP trials. The similar falls in BCVA after VPDT across the pre-defined TAP eligibility groups do not mean that the treatment is equally effective in these groups, because deterioration in BCVA can be influenced by the parameters that determined group membership. Safety was no worse than in the TAP trials. The estimated cost per QALY was similar to the highest previous estimate. Although VPDT is no longer in use as monotherapy for neovascular AMD, its role as adjunctive treatment has not been fully explored. VPDT also has potential as monotherapy in the management of vascular malformations of the retina and choroid, and trials are underway in neovascularisation due to myopia and polypoidal choroidopathy.
The National Institute for Health Research Health Technology Assessment programme.
Health technology assessment (Winchester, England). 02/2012; 16(6):i-xii, 1-200.
ABSTRACT: Propensity score (Pscore) matching and inverse probability of treatment weighting (IPTW) can remove bias due to observed confounders, if the Pscore is correctly specified. Genetic Matching (GenMatch) matches on the Pscore and individual covariates using an automated search algorithm to balance covariates. This paper compares common ways of implementing Pscore matching and IPTW with GenMatch for balancing time-constant baseline covariates. The methods are considered when estimates of treatment effectiveness are required for patient subgroups, and the treatment allocation process differs by subgroup. We apply these methods in a prospective cohort study that estimates the effectiveness of Drotrecogin alfa activated for subgroups of patients with severe sepsis. In a simulation study we compare the methods when the Pscore is correctly specified, and then misspecified by ignoring the subgroup-specific treatment allocation. The simulations also consider poor overlap in baseline covariates, and different sample sizes. In the case study, GenMatch reports better covariate balance than IPTW or Pscore matching. In the simulations with correctly specified Pscores, good overlap and reasonable sample sizes, all methods report minimal bias. When the Pscore is misspecified, GenMatch reports the least imbalance and bias. With small sample sizes, IPTW is the most efficient approach, but all methods report relatively high bias of treatment effects. This study shows that overall GenMatch achieves the best covariate balance for each subgroup, and is more robust to Pscore misspecification than common alternative Pscore approaches.
The International Journal of Biostatistics 01/2012; 8(1):25. · 1.28 Impact Factor
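The inverse probability of treatment weighting named in the abstract above can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated and uses the common unstabilised weights (1/e for treated units, 1/(1 - e) for controls) with a weighted difference in mean outcomes. This is not the authors' implementation; the function names and the small data set are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Unstabilised IPTW weights from treatment indicators and propensity scores.

    treated:    sequence of 0/1 treatment indicators
    propensity: sequence of estimated P(treated = 1 | covariates), each in (0, 1)
    """
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t == 1 else 1.0 / (1.0 - e))
    return weights


def weighted_mean_difference(outcome, treated, weights):
    """Weighted mean outcome of treated units minus that of control units."""
    rows = list(zip(outcome, treated, weights))
    sum_w_treated = sum(w for _, t, w in rows if t == 1)
    sum_w_control = sum(w for _, t, w in rows if t == 0)
    if sum_w_treated == 0 or sum_w_control == 0:
        raise ValueError("both treatment groups must be represented")
    mean_treated = sum(w * y for y, t, w in rows if t == 1) / sum_w_treated
    mean_control = sum(w * y for y, t, w in rows if t == 0) / sum_w_control
    return mean_treated - mean_control


# Hypothetical data: four patients, two treated and two untreated.
treated = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
outcome = [10.0, 12.0, 9.0, 7.0]

weights = iptw_weights(treated, propensity)
effect = weighted_mean_difference(outcome, treated, weights)
```

Propensity scores close to 0 or 1 produce very large weights, which is one reason the abstracts stress that IPTW performs well only when the weights are stable.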