Administrative Data Algorithms to Identify Second Breast Cancer Events Following Early-Stage Invasive Breast Cancer
Studies of breast cancer outcomes rely on the identification of second breast cancer events (recurrences and second breast primary tumors). Cancer registries often do not capture recurrences, and chart abstraction can be infeasible or expensive. An alternative is using administrative health-care data to identify second breast cancer events; however, these algorithms must be validated against a gold standard.
We developed algorithms using data from 3152 women in an integrated health-care system who were diagnosed with stage I or II breast cancer in 1993-2006. Medical record review served as the gold standard for second breast cancer events. Administrative data used in algorithm development included procedures, diagnoses, prescription fills, and cancer registry records. We randomly divided the cohort into training and testing samples and used a classification and regression tree analysis to build algorithms for classifying women as having or not having a second breast cancer event. We created several algorithms for researchers to use based on the relative importance of sensitivity, specificity, and positive predictive value (PPV) in future studies.
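The split-and-tree approach described above can be sketched in a few lines. This is an illustration only, not the authors' implementation: it shows a random training/testing split and a single Gini-impurity split, which is the basic step a classification tree repeats recursively. The 50/50 split fraction, the seed, and the field names `secondary_dx_code`, `imaging_code`, and `event` are hypothetical; the source does not report them.

```python
import random


def train_test_split(rows, test_fraction=0.5, seed=0):
    """Randomly divide rows into training and testing samples.

    The 50/50 default is an assumption; the source does not state the ratio.
    """
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_binary_split(rows, features, label="event"):
    """Return (feature, score) for the 0/1 feature with the lowest
    weighted Gini impurity after splitting on it.

    A full classification tree would apply this step recursively to
    each resulting subset and then prune; only one step is shown here.
    """
    best_feature, best_score = None, float("inf")
    n = len(rows)
    for feature in features:
        left = [r[label] for r in rows if r[feature] == 0]
        right = [r[label] for r in rows if r[feature] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_feature, best_score = feature, score
    return best_feature, best_score


if __name__ == "__main__":
    # Synthetic rows with hypothetical indicator names, for illustration only.
    rows = [
        {"secondary_dx_code": i % 2, "imaging_code": (i // 2) % 2, "event": i % 2}
        for i in range(40)
    ]
    train, test = train_test_split(rows)
    print(best_binary_split(train, ["imaging_code", "secondary_dx_code"]))
```

In practice the candidate features would be indicators derived from procedure, diagnosis, prescription, and registry records, and the tree grown on the training sample would be evaluated on the testing sample against medical record review.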
The algorithm with high specificity and PPV had 89% sensitivity (95% confidence interval [CI] = 84% to 92%), 99% specificity (95% CI = 98% to 99%), and 90% PPV (95% CI = 86% to 94%); the high-sensitivity algorithm had 96% sensitivity (95% CI = 93% to 98%), 95% specificity (95% CI = 94% to 96%), and 74% PPV (95% CI = 68% to 78%).
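For readers who want to reproduce this kind of summary on their own validation data, the sketch below computes sensitivity, specificity, and PPV from a 2x2 table of algorithm result versus gold standard. The Wilson score interval is an assumption made for illustration, because the source does not state which confidence-interval method was used, and the counts in the usage example are invented rather than taken from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def classification_metrics(tp, fp, tn, fn):
    """Return {metric: (estimate, (ci_low, ci_high))} from 2x2 counts.

    tp/fn: women with a gold-standard event flagged / missed by the algorithm.
    fp/tn: women without an event flagged / correctly not flagged.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts, not the study's data.
    for name, (estimate, (low, high)) in classification_metrics(
        tp=90, fp=10, tn=890, fn=10
    ).items():
        print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Because PPV depends on how common second events are in the cohort, an algorithm with fixed sensitivity and specificity can show a different PPV in a population with a different event rate.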
Algorithms based on administrative data can identify second breast cancer events with high sensitivity, specificity, and PPV. The algorithms presented here promote efficient outcomes research, allowing researchers to prioritize sensitivity, specificity, or PPV in identifying second breast cancer events.
Available from: PubMed Central
ABSTRACT: This paper provides a review of the past, present, and future of public health surveillance, defined as the ongoing systematic collection, analysis, interpretation, and dissemination of health data for the planning, implementation, and evaluation of public health action. Public health surveillance dates back to the first recorded epidemic in 3180 B.C. in Egypt. Hippocrates (460 B.C.-370 B.C.) coined the terms endemic and epidemic, John Graunt (1620-1674) introduced systematic data analysis, Samuel Pepys (1633-1703) started epidemic field investigation, William Farr (1807-1883) founded the modern concept of surveillance, John Snow (1813-1858) linked data to intervention, and Alexander Langmuir (1910-1993) gave the first comprehensive definition of surveillance. Current theories, principles, and practice of public health surveillance are summarized. A number of surveillance dichotomies, such as epidemiologic surveillance versus public health surveillance, are described. Some future scenarios are presented, and current activities that can affect the future are summarized: exploring new frontiers; enhancing computer technology; improving epidemic investigations; improving data collection, analysis, dissemination, and use; building on lessons from the past; building capacity; and enhancing global surveillance. It is concluded that learning from the past, reflecting on the present, and planning for the future can further enhance public health surveillance.
ABSTRACT: BACKGROUND: A substantial proportion of cancer-related mortality is attributable to recurrent, not de novo metastatic disease, yet we know relatively little about these patients. To fill this gap, investigators often use administrative codes for secondary malignant neoplasm or chemotherapy to identify recurrent cases in population-based datasets. However, these algorithms have not been validated in large, contemporary, routine care cohorts. OBJECTIVE: To evaluate the validity of secondary malignant neoplasm and chemotherapy codes as indicators of recurrence after definitive local therapy for stage I-III lung, colorectal, breast, and prostate cancer. RESEARCH DESIGN, SUBJECTS, AND MEASURES: We assessed the sensitivity, specificity, and positive predictive value (PPV) of these codes 14 and 60 months after diagnosis using 2 administrative datasets linked with gold-standard recurrence status information: CanCORS/Medicare (diagnoses 2003-2005) and HMO/Cancer Research Network (diagnoses 2000-2005). RESULTS: We identified 929 CanCORS/Medicare patients and 5298 HMO/CRN patients. Sensitivity, specificity, and PPV ranged widely depending on which codes were included and the type of cancer. For patients with lung, colorectal, and breast cancer, the combination of secondary malignant neoplasm and chemotherapy codes was the most sensitive (75%-85%); no code-set was highly sensitive and highly specific. For prostate cancer, no code-set offered even moderate sensitivity (≤19%). CONCLUSIONS: Secondary malignant neoplasm and chemotherapy codes could not identify recurrent cancer without some risk of misclassification. Findings based on existing algorithms should be interpreted with caution. More work is needed to develop a valid algorithm that can be used to characterize outcomes and define patient cohorts for comparative effectiveness research studies.
ABSTRACT: INTRODUCTION: Much progress has been made in cancer survivorship research, but there are still many unanswered questions that can and need to be addressed by collaborative research consortia.
METHODS: Since 1999, the National Cancer Institute-funded HMO Cancer Research Network (CRN) has engaged in a wide variety of research focusing on cancer survivorship. With a focus on thematic topics in cancer survivorship, we describe how the CRN has contributed to research in cancer survivorship and the resources it offers for future collaborations.
RESULTS: We identified the following areas of cancer survivorship research: surveillance for and predictors of recurrences, health care delivery and care coordination, health care utilization and costs, psychosocial outcomes, cancer communication and decision making, late effects of cancer and its treatment, use of and adherence to adjuvant therapies, and lifestyle and behavioral interventions following cancer treatment.
CONCLUSIONS: With over a decade of experience using cancer data in community-based settings, the CRN investigators and their collaborators are poised to generate evidence in cancer survivorship research.
IMPLICATIONS FOR CANCER SURVIVORS: Collaborative research within these settings can improve the quality of care for cancer survivors within and beyond integrated health care delivery systems.