Administrative data algorithms to identify second breast cancer events following early-stage invasive breast cancer.

Group Health Research Institute, 1730 Minor Ave, Ste. 1600, Seattle, WA 98101, USA.
CancerSpectrum Knowledge Environment (Impact Factor: 15.16). 04/2012; 104(12):931-40. DOI: 10.1093/jnci/djs233
Source: PubMed

ABSTRACT Studies of breast cancer outcomes rely on the identification of second breast cancer events (recurrences and second breast primary tumors). Cancer registries often do not capture recurrences, and chart abstraction can be infeasible or expensive. An alternative is using administrative health-care data to identify second breast cancer events; however, these algorithms must be validated against a gold standard.
We developed algorithms using data from 3152 women in an integrated health-care system who were diagnosed with stage I or II breast cancer in 1993-2006. Medical record review served as the gold standard for second breast cancer events. Administrative data used in algorithm development included procedures, diagnoses, prescription fills, and cancer registry records. We randomly divided the cohort into training and testing samples and used classification and regression tree analysis to build algorithms that classify women as having or not having a second breast cancer event. We created several algorithms so that researchers can choose among them according to the relative importance of sensitivity, specificity, and positive predictive value (PPV) in future studies.
The algorithm with high specificity and PPV had 89% sensitivity (95% confidence interval [CI] = 84% to 92%), 99% specificity (95% CI = 98% to 99%), and 90% PPV (95% CI = 86% to 94%); the high-sensitivity algorithm had 96% sensitivity (95% CI = 93% to 98%), 95% specificity (95% CI = 94% to 96%), and 74% PPV (95% CI = 68% to 78%).
Algorithms based on administrative data can identify second breast cancer events with high sensitivity, specificity, and PPV. The algorithms presented here promote efficient outcomes research, allowing researchers to prioritize sensitivity, specificity, or PPV in identifying second breast cancer events.
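The three performance measures reported above come from cross-tabulating the algorithm's classification against the gold standard (medical record review). The sketch below shows one way to compute them in Python. It makes two assumptions not taken from the study: the counts are hypothetical, and the 95% confidence intervals use the Wilson score method, since the abstract does not state which interval method the authors used.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its CI method; Wilson is used
    here only as a common, well-behaved choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def validation_metrics(tp, fp, fn, tn):
    """Return (estimate, (lower, upper)) for sensitivity, specificity, and PPV.

    tp: algorithm positive, gold standard positive
    fp: algorithm positive, gold standard negative
    fn: algorithm negative, gold standard positive
    tn: algorithm negative, gold standard negative
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_ci(tp, tp + fp)),
    }


# Hypothetical 2x2 counts for illustration only (not the study's data).
metrics = validation_metrics(tp=90, fp=10, fn=10, tn=890)
for name, (estimate, (lower, upper)) in metrics.items():
    print(f"{name}: {estimate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Unlike sensitivity and specificity, PPV depends on how common second events are in the cohort, which is one reason an algorithm can have specificity above 90% and a noticeably lower PPV.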

  • ABSTRACT: Studies on outcomes in bladder cancer rely on accurate methods to identify bladder cancer patients and differentiate bladder cancer stage. Medical record and administrative databases are increasingly used to study cancer incidence, but few have distinguished cancer stage, and none have focused on bladder cancer. In this study, we used data from The UK Health Improvement Network (THIN) to identify patients with bladder cancer using at least one diagnostic code for bladder cancer, and to distinguish muscle-invasive from non-invasive disease using a subsequent code for cystectomy. Algorithms were validated against a gold standard of physician-completed questionnaires, pathology reports, and consultant letters. Algorithm performance was evaluated by measuring positive predictive value (PPV) and corresponding 95% CI. Among all patients coded with bladder cancer (n=194), PPV for any bladder cancer was 99.5% (95% CI = 97.2-99.9). PPV for incident bladder cancer was 93.8% (95% CI = 89.4-96.7). PPV for muscle-invasive bladder cancer was 70.1% (95% CI = 59.4-79.5) in patients with cystectomy (n=95) and 83.9% (95% CI = 66.3-94.5) in those with cystectomy plus additional codes for metastases and death (n=31). Using our codes for bladder cancer, the age- and sex-standardized incidence rate (SIR) of bladder cancer in THIN approximated that measured by cancer registries (SIR within 20%), suggesting that sensitivity was high as well. THIN is a valid and novel database for the study of bladder cancer. Our algorithm can be used to examine the epidemiology of muscle-invasive bladder cancer or outcomes following cystectomy for patients with muscle-invasive disease.
    Cancer Epidemiology Biomarkers & Prevention 11/2014; 24(1). DOI:10.1158/1055-9965.EPI-14-0677 · 4.32 Impact Factor
  • ABSTRACT: Large, population-based studies are needed to better understand lymphedema, a major source of morbidity among breast cancer survivors. One challenge is identifying lymphedema in a consistent fashion. We sought to develop and validate an algorithm using Medicare claims to identify lymphedema after breast cancer surgery.
    Journal of Cancer Survivorship 09/2014; DOI:10.1007/s11764-014-0393-z · 3.29 Impact Factor
  • ABSTRACT: To examine the validity of claims data to identify colorectal cancer (CRC) recurrence and determine the extent to which misclassification of recurrence status affects estimates of its association with overall survival in a population-based administrative database. We calculated the accuracy of claims data relative to medical records from one large tertiary hospital to identify CRC recurrence. We estimated the effect of misclassifying recurrence on survival by applying these findings to the linked Surveillance, Epidemiology, and End Results-Medicare data. Of 174 eligible CRC patients identified through medical records, 32 (18.4%) had a recurrence. A claims-based algorithm of secondary malignancy codes yielded a sensitivity of 81% and specificity of 99% for identifying recurrence. Agreement between data sources was almost perfect (kappa: 0.86). In a model unadjusted for misclassification, CRC patients with recurrence were 3.04 times (95% confidence interval: 2.92-3.17) more likely to die of any cause than those without recurrence. In the corrected model, CRC patients with recurrence were 3.47 times (95% confidence interval: 3.06-4.14) more likely to die than those without recurrence. Identifying recurrence in CRC patients using claims data is feasible with moderate sensitivity and high specificity. Future studies can use this algorithm with Surveillance, Epidemiology, and End Results-Medicare data to study treatment patterns and outcomes of CRC patients with recurrence.
    Annals of Epidemiology 01/2015; 25(4). DOI:10.1016/j.annepidem.2015.01.005 · 2.15 Impact Factor