Developing and validating a diabetes database in a large health system

University of Pittsburgh, Department of Epidemiology, Graduate School of Public Health, Pittsburgh, United States.
Diabetes Research and Clinical Practice (Impact Factor: 2.54). 04/2007; 75(3):313-9. DOI: 10.1016/j.diabres.2006.07.007
Source: PubMed


One component of clinical information systems is a registry of patients. Registries allow providers to identify gaps in care at the population level. Registries also allow for rapid-cycle continuous quality improvement, targeted practice change and improved outcomes. Most registries are built based on membership with an insurer or other selection criteria. Few, if any, data exist on registries representing demographically heterogeneous populations.
Administrative and clinical data for the period 1/1/2000-12/30/03 were examined. In total, 46,082,941 lab reports, 233,292,544 medical records, and 9,351,415 medical record abstracts, representing approximately 2 million unique patients, were searched. The diabetes source population was identified by the presence of any one of the following criteria: ICD-9 code 250 (diabetes) for inpatient, emergency room or outpatient visits; any hemoglobin A1c result; blood glucose >200 mg/dl; or diabetes medication. A diagnosis of diabetes was verified by trained chart reviewers on a sample of patients. Single indicators and combinations of indicators were examined to determine optimal identification of these cases.
In two separate validation studies, using two or more indicators or outpatient diagnosis maximized positive predictive value (PPV) (96 and 97%) and sensitivity (99 and 100%) and identified 55,807 individuals. When all patients with a single indicator of outpatient diagnosis (which had the highest single PPV of 94 and 95%) were included together with those having ≥2 indicators, the final sample size was 65,725.
Requiring two or more indicators or an outpatient diagnosis identifies a sizeable and unselective diabetes database that can be used to track processes and outcomes.
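
The case definition and validation metrics described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `PatientIndicators` fields, the choice to count each visit setting (inpatient, emergency room, outpatient) as a separate indicator, and the toy data in the usage note. None of it is the authors' implementation, and the abstract does not say how indicators were counted.

```python
from dataclasses import dataclass, astuple


@dataclass
class PatientIndicators:
    """Hypothetical per-patient flags; field names are illustrative only."""
    inpatient_dx: bool = False      # ICD-9 250 on an inpatient visit
    emergency_dx: bool = False      # ICD-9 250 on an emergency room visit
    outpatient_dx: bool = False     # ICD-9 250 on an outpatient visit
    any_hba1c: bool = False         # any hemoglobin A1c result on file
    glucose_over_200: bool = False  # blood glucose > 200 mg/dl
    diabetes_med: bool = False      # any diabetes medication


def indicator_count(p):
    """Number of indicators present (assumes each flag counts once)."""
    return sum(astuple(p))


def in_source_population(p):
    """Source population: any one of the criteria is present."""
    return indicator_count(p) >= 1


def meets_case_definition(p):
    """Final rule: two or more indicators, or an outpatient diagnosis."""
    return p.outpatient_dx or indicator_count(p) >= 2


def ppv_and_sensitivity(flagged, confirmed):
    """Compare rule output against chart review (the reference standard).

    PPV = TP / (TP + FP); sensitivity = TP / (TP + FN).
    """
    pairs = list(zip(flagged, confirmed))
    tp = sum(1 for f, c in pairs if f and c)
    fp = sum(1 for f, c in pairs if f and not c)
    fn = sum(1 for f, c in pairs if not f and c)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity
```

As a usage example with made-up data, a patient with only an HbA1c result is in the source population but does not meet the case definition, whereas a patient with only an outpatient diagnosis does: `meets_case_definition(PatientIndicators(any_hba1c=True))` is `False` and `meets_case_definition(PatientIndicators(outpatient_dx=True))` is `True`.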

  • Source
    • "Finally, we aimed to focus our study on patients with type 2 diabetes, but had to accept limitations in the granularity of ICD9-CM codes used for billing. Studies have been largely unsuccessful in validating algorithms to distinguish between diabetes types [13-15,31]. In order to estimate the extent of potential misclassification we provide results of a sensitivity analysis using a conservative approach that did not allow any codes for type 1 or unspecified diabetes in our sub-cohort of type 2 diabetic patients. "
    ABSTRACT: With the increasing prevalence of type 2 diabetes in young adulthood, treatment of diabetes in pregnancy faces new challenges. Anti-diabetic drug utilization patterns of pregnant women with pre-existing diabetes are poorly described. We aim to describe anti-diabetic (AD) agent utilization among diabetic pregnant women. We utilized IMS LifeLink, including administrative claims data of patients in US managed care plans, to establish a retrospective cohort of women, age 18-46 years (N = 96,740), with billed procedures for a live birth and an eligibility period of 12 months before and 3 months after delivery. Diabetes mellitus was identified from ≥2 in- or outpatient claims with diagnoses (ICD-9-CM 250.XX) before pregnancy. We estimated the prevalence of AD drugs before, during and after pregnancy, and secular trends across the study period (1999-2009), using linear regression. A sensitivity analysis was conducted to identify the extent of misclassification of trimesters. Almost six percent (n = 5,581) of the live birth cohort had diabetes mellitus. Across the study period, the proportion of diabetic women who received AD drugs during pregnancy rose from 48% (1999) to 78% (2009) (p < 0.0001). The most common AD drugs during pregnancy were insulin, metformin, sulfonylureas, thiazolidinediones (TZD), and combination AD. The annual prevalence of insulin use during pregnancy increased by only 1%, from 39% (1999) to 40% (2009) (p = 0.589), while use of sulfonylureas and metformin increased from 2.5% and 4.2% (1999) to 17.3% and 15.3% (2009) (p < 0.0001), respectively. Insulin and sulfonylurea use steadily increased in prevalence from the 1st to 3rd trimester (16.5% and 3.3% to 33.0% and 7.5%), while metformin and TZD use decreased (11.4% and 1.6% to 3.8% and 0.2%). AD use during pregnancy demonstrates the need for additional investigation regarding safety and efficacy of AD drugs on maternal outcomes.
    BMC Pregnancy and Childbirth 01/2014; 14(1):28. DOI:10.1186/1471-2393-14-28 · 2.19 Impact Factor
  • Source
    • "We developed an electronic case-finding algorithm that accurately identified patients with diabetes at their earliest possible date within a healthcare system using data extracted from an EHR. The performance of our model in identifying patients with diabetes is comparable to other diabetes case-finding algorithms [10-17]. However, the distinct advantage of our automated, real-time algorithm is the timely recognition of diabetes. "
    ABSTRACT: Effective population management of patients with diabetes requires timely recognition. Current case-finding algorithms can accurately detect patients with diabetes, but lack real-time identification. We sought to develop and validate an automated, real-time diabetes case-finding algorithm to identify patients with diabetes at the earliest possible date. The source population included 160,872 unique patients from a large public hospital system between January 2009 and April 2011. A diabetes case-finding algorithm was iteratively derived using chart review and subsequently validated (n = 343) in a stratified random sample of patients, using data extracted from the electronic health records (EHR). A point-based algorithm using encounter diagnoses, clinical history, pharmacy data, and laboratory results was used to identify diabetes cases. The date when accumulated points reached a specified threshold equated to the diagnosis date. Physician chart review served as the gold standard. The electronic model had a sensitivity of 97%, specificity of 90%, positive predictive value of 90%, and negative predictive value of 96% for the identification of patients with diabetes. The kappa score for agreement between the model and physician for the diagnosis date allowing for a 3-month delay was 0.97, where 78.4% of cases had exact agreement on the precise date. A diabetes case-finding algorithm using data exclusively extracted from a comprehensive EHR can accurately identify patients with diabetes at the earliest possible date within a healthcare system. The real-time capability may enable proactive disease management.
    BMC Medical Informatics and Decision Making 08/2013; 13(1):81. DOI:10.1186/1472-6947-13-81 · 1.83 Impact Factor
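
The point-based approach in the abstract above (points accumulate from encounter diagnoses, clinical history, pharmacy data, and laboratory results, and the diagnosis date is the date the total reaches a threshold) can be sketched roughly as follows. The abstract does not report the point values or the threshold, so `POINTS`, `THRESHOLD`, and the event-kind names are hypothetical placeholders rather than the published algorithm.

```python
from datetime import date

# Hypothetical weights and threshold; the abstract does not report them.
POINTS = {
    "encounter_dx": 2,      # diabetes diagnosis recorded at an encounter
    "clinical_history": 1,  # diabetes noted in the clinical history
    "pharmacy": 2,          # diabetes medication dispensed
    "lab": 2,               # laboratory result consistent with diabetes
}
THRESHOLD = 4


def diagnosis_date(events, points=POINTS, threshold=THRESHOLD):
    """Return the date on which accumulated points first reach the threshold.

    `events` is an iterable of (date, kind) tuples for one patient.
    Returns None if the threshold is never reached.
    """
    total = 0
    for when, kind in sorted(events):    # process in chronological order
        total += points.get(kind, 0)     # unknown kinds contribute nothing
        if total >= threshold:
            return when
    return None
```

With these placeholder weights, a patient with a history entry on 2009-06-01, a lab result on 2010-03-01, and a pharmacy record on 2010-05-01 would cross the threshold on the pharmacy date, since the running totals are 1, 3, and 5.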
  • Source
    • "Validation studies of administrative data have primarily focused on diagnosis of disease [10-20]. In cancer research, however, the primary data source used for identifying cancer cases is typically a well-established cancer registry; administrative data are not usually used or needed to identify cancer cases. "
    ABSTRACT: Background: Validation of administrative data is important to assess potential sources of bias in outcome evaluation and to prevent dissemination of misleading or inaccurate information. The purpose of the study was to determine the completeness and accuracy of endoscopy data in several administrative data sources in the year prior to colorectal cancer diagnosis as part of a larger project focused on evaluating the quality of pre-diagnostic care. Methods: Primary and secondary data sources for endoscopy were collected from the Alberta Cancer Registry, cancer medical charts and three different administrative data sources. 1672 randomly sampled patients diagnosed with invasive colorectal cancer in years 2000–2005 in Alberta, Canada were included. A retrospective validation study of administrative data for endoscopy in the year prior to colorectal cancer diagnosis was conducted. A gold standard dataset was created by combining all the datasets. Number and percent identified, agreement and percent unique to a given data source were calculated and compared across each dataset and to the gold standard with respect to identifying all patients who underwent endoscopy and all endoscopies received by those patients. Results: The combined administrative data and the physician billing data identified an equal or higher percentage of patients who had one or more endoscopies (84% and 78%, respectively) and of total endoscopy procedures (89% and 81%, respectively) than the chart review (78% for both). Conclusions: Endoscopy data have a high level of completeness and accuracy in physician billing data alone. Combined with hospital in/outpatient data, they are more complete than chart review alone.
    BMC Health Services Research 10/2012; 12(1):358. DOI:10.1186/1472-6963-12-358 · 1.71 Impact Factor