Temporal and within practice variability in the Health Improvement Network

Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA, USA.
Pharmacoepidemiology and Drug Safety (Impact Factor: 2.94). 07/2011; 20(9):948-55. DOI: 10.1002/pds.2191
Source: PubMed


The Health Improvement Network (THIN) database is a primary care electronic medical record database in the UK designed for pharmacoepidemiologic research. Matching on practice and calendar year is often used to account for secular trends over time and for differences across practices. However, little is known about the consistency within practices across observation years and among practices within a given year, in THIN or other large medical record databases.
We analyzed mortality rates, cancer incidence rates, prescribing rates, and encounter rates across 415 practices from 2000 to 2007, using the practice-year as the unit of observation in separate random-effects and fixed-effects longitudinal Poisson regression models. Adjusted models accounted for aggregate practice-level characteristics (smoking, obesity, age, and Vision software experience).
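As a rough illustration of the kind of model described above, the following is a minimal sketch of a Poisson regression of event counts on calendar year with a log person-time offset, fitted by Newton-Raphson. It is deliberately simplified: it pools all observations and omits the practice-level random or fixed effects and the practice covariates used in the study, so it is not the authors' model. The function name `fit_poisson_trend`, the example counts, and the person-years are all hypothetical.

```python
import math

def fit_poisson_trend(years, events, person_years, max_iter=50, tol=1e-10):
    """Fit log E[events] = log(person_years) + b0 + b1 * (year - first year).

    Simplified sketch: no practice effects, no covariates (assumption).
    Returns (b0, b1); exp(b1) is the rate ratio per calendar year.
    """
    first = min(years)
    x = [y - first for y in years]
    # Start from the pooled crude rate with no trend.
    b0 = math.log(sum(events) / sum(person_years))
    b1 = 0.0
    for _ in range(max_iter):
        mu = [n * math.exp(b0 + b1 * xi) for n, xi in zip(person_years, x)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(e - m for e, m in zip(events, mu))
        g1 = sum((e - m) * xi for e, m, xi in zip(events, mu, x))
        # 2x2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical practice-year data: a 1% baseline rate falling 3% per year.
example_years = list(range(2000, 2008))
example_person_years = [50000.0] * len(example_years)
example_events = [n * 0.01 * math.exp(-0.03 * (y - 2000))
                  for n, y in zip(example_person_years, example_years)]

b0_hat, b1_hat = fit_poisson_trend(example_years, example_events,
                                   example_person_years)
print("rate ratio per calendar year:", math.exp(b1_hat))
```

In this sketch a rate ratio below 1 corresponds to a declining rate over calendar time, and above 1 to an increasing one. A real analysis would add a term per practice (fixed effects) or a practice-level random intercept.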
In adjusted models, subsequent calendar years were associated with lower reported mortality rates, increasing cancer reporting rates, increasing prescriptions per patient, and decreasing encounters per patient, with a corresponding linear trend (p < 0.001 for all analyses). For calendar year 2007, the ratio of the 75th percentile to the 25th percentile for crude rate of cancer, mortality, prescriptions, and encounters was 1.63, 1.63, 1.45, and 1.42, respectively. Adjusting for practice characteristics reduced the among-practice variation by approximately 40%.
THIN data are characterized by secular trends and among-practice variation, both of which should be considered in the design of pharmacoepidemiology studies. Whether these are trends in data quality or true secular trends could not be definitively differentiated.
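The among-practice spread reported above is summarized as the ratio of the 75th percentile to the 25th percentile of practice-level crude rates. A minimal sketch of that calculation follows, assuming one event count and one person-time denominator per practice for a single calendar year. The helper names and the example numbers are hypothetical, and the percentile interpolation rule (Python's "inclusive" method) is an assumption, since the abstract does not state which rule was used.

```python
from statistics import quantiles

def crude_rates(events, person_years):
    """Crude rate per practice: events divided by person-years."""
    return [e / n for e, n in zip(events, person_years)]

def percentile_ratio(rates):
    """Ratio of the 75th to the 25th percentile of practice-level rates."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _median, q3 = quantiles(rates, n=4, method="inclusive")
    return q3 / q1

# Hypothetical data for five practices in one calendar year.
example_events = [10, 20, 30, 40, 50]
example_person_years = [1000] * 5

example_ratio = percentile_ratio(
    crude_rates(example_events, example_person_years)
)
print("75th/25th percentile ratio:", example_ratio)
```

A ratio of 1.63, as reported for cancer and mortality in 2007, would mean that a practice at the 75th percentile records a crude rate 63% higher than a practice at the 25th percentile.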

    • "A potential limitation is that data analysis taking into consideration clustering by general practice would have provided insight into potential bias resulting from variable data quality and confidence intervals that could have a different width than reported. We find reassuring that others have reported little evidence of such potential bias after matching on practice [18]. Likewise, the use of matching on general practice could result in wider confidence intervals but it could also reduce variability overall [19]. "
    ABSTRACT: Background The initial treatment strategy for patients with type 2 diabetes includes lifestyle change recommendations. When patients are not successful in controlling their blood glucose levels through a healthier lifestyle, pharmaceutical agents are recommended. The objective of this study is to identify determinants of initial treatment change following initiation of non-insulin antihyperglycaemic treatment (OAD) for UK patients with type 2 diabetes. Methods A retrospective cohort study using primary care data from the Clinical Practice Research Datalink between January 2006 and February 2011. Each patient had an OAD prescription. The main treatment pattern outcomes were discontinuation, switching, augmentation and initiation of insulin. Glycaemic control was assessed using HbA1c. Results 63,060 patients initiated OAD therapy 2006–2010 and 3.4% were prescribed insulin during follow-up. 26% with at least four years of follow-up remained on the initial treatment. Metformin dominated (90%) in UK primary care. Around 75% had a record of HbA1c testing prior to initiating therapy. On initiating OAD, half the patients had HbA1c values >65 mmol/mol and one quarter >80 mmol/mol. The initial values of HbA1c were reduced after 12 months and remained stable. There were 15%-18% of patients whose values increased since initiating OAD. Increased baseline HbA1c is associated with increased chance of augmentation and decreased chance of discontinuation. HbA1c values at 1 year were associated with a three-fold increase in the chance of augmentation, 130% increase in the chance of switching and 14% increase in the chance of discontinuation with each 10 mmol/mol increase. Following initiation of OAD, HbA1c was reduced by an average of 16 mmol/mol during the first year. Conclusion There are patients for whom glycaemic control worsens and a majority remained above the recommended level, suggesting an unmet need despite the availability of many OAD.
    BMC Endocrine Disorders 08/2014; 14(1):73. DOI:10.1186/1472-6823-14-73 · 1.71 Impact Factor
    • "This included looking for elements with values that were outside biologically plausible ranges or that changed implausibly over time [89] or zero-valued elements [73]. Other researchers compared distributions of data values between practices [50, 101] or with national rates [102, 103], or looked at agreement between related elements [38, 73]. "
    ABSTRACT: Objective To review the methods and dimensions of data quality assessment in the context of electronic health record (EHR) data reuse for research. Materials and methods A review of the clinical research literature discussing data quality assessment methodology for EHR data was performed. Using an iterative process, the aspects of data quality being measured were abstracted and categorized, as well as the methods of assessment used. Results Five dimensions of data quality were identified, which are completeness, correctness, concordance, plausibility, and currency, and seven broad categories of data quality assessment methods: comparison with gold standards, data element agreement, data source agreement, distribution comparison, validity checks, log review, and element presence. Discussion Examination of the methods by which clinical researchers have investigated the quality and suitability of EHR data for research shows that there are fundamental features of data quality, which may be difficult to measure, as well as proxy dimensions. Researchers interested in the reuse of EHR data for clinical research are recommended to consider the adoption of a consistent taxonomy of EHR data quality, to remain aware of the task-dependence of data quality, to integrate work on data quality assessment from other fields, and to adopt systematic, empirically driven, statistically based methods of data quality assessment. Conclusion There is currently little consistency or potential generalizability in the methods used to assess EHR data quality. If the reuse of EHR data for clinical research is to become accepted, researchers should adopt validated, systematic methods of EHR data quality assessment.
    Journal of the American Medical Informatics Association 06/2012; 20(1). DOI:10.1136/amiajnl-2011-000681 · 3.50 Impact Factor
    • "Incidence rate ratios (IRR) between different population strata were obtained using multivariate Cox proportional hazards regression. We further analysed the incidence rate ratios using separate random effects Poisson regression models to adjust for any effects due to the variable reporting in general practices [23]. "
    ABSTRACT: There is pressing need to diagnose lung cancer earlier in the United Kingdom (UK) and it is likely that research using computerised general practice records will help this process. Linkage of these records to area-level geo-demographic classifications may also facilitate case ascertainment for public health programmes; however, there have as yet been no extensive studies of data validity for such purposes. To first address the need for validation, we assessed the completeness and representativeness of lung cancer data from The Health Improvement Network (THIN) national primary care database by comparing incidence and survival between 2000 and 2009 with the UK National Cancer Registry and the National Lung Cancer Audit Database. Secondly, we explored the potential of a geo-demographic social marketing tool to facilitate disease ascertainment by using Experian's Mosaic Public Sector™ classification to identify detailed profiles of the sectors of society where lung cancer incidence was highest. Overall incidence of lung cancer (41.4/100,000 person-years, 95% confidence interval 40.6-42.1) and median survival (232 days) were similar to other national data; the incidence rate in THIN from 2003-2006 was found to be just over 93% of the national cancer registry rate. Incidence increased considerably with area-level deprivation measured by the Townsend Index and was highest in the North-West of England (65.1/100,000 person-years). Wider variations in incidence were however identified using Mosaic classifications, with the highest incidence in Mosaic Public Sector™ types 'Cared-for pensioners', 'Old people in flats' and 'Dignified dependency' (191.7, 174.2 and 117.1 per 100,000 person-years respectively). Routine electronic data in THIN are a valid source of lung cancer information. Mosaic™ identified greater incidence differentials than standard area-level measures and as such could be used as a tool for public health programmes to ascertain future cases more effectively.
    BMC Public Health 11/2011; 11(1):857. DOI:10.1186/1471-2458-11-857 · 2.26 Impact Factor