Article

Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface

Department of Informatics, University of Sussex, Brighton, UK.
Journal of the American Medical Informatics Association (Impact Factor: 3.93). 11/2013; 21(2). DOI: 10.1136/amiajnl-2013-001847
Source: PubMed

ABSTRACT UK primary care databases, which contain diagnostic, demographic and prescribing information for millions of patients geographically representative of the UK, represent a significant resource for health services and clinical research. They can be used to identify patients with a specified disease or condition (phenotyping) and to investigate patterns of diagnosis and symptoms. Currently, extracting such information manually is time-consuming and requires considerable expertise. To exploit the potential of these large and complex databases more fully, our interdisciplinary team developed generic methods that make them accessible to different types of user.
Using the Clinical Practice Research Datalink database, we have developed an online user-focused system (TrialViz), which enables users to select suitable general practices interactively on the basis of two criteria: suitability of the patient base for the intended study (phenotyping) and measures of data quality.
An end-to-end system, underpinned by an innovative search algorithm, allows the user to extract information in near real-time via an intuitive query interface and to explore this information using interactive visualization tools. A usability evaluation of this system produced positive results.
We present the challenges and results in the development of TrialViz and our plans to extend it to wider applications in clinical research.
Our fast search algorithms and simple query algorithms represent a significant advance for users of clinical research databases.

Available from: N. Beloff, Feb 26, 2014
  • Source
    ABSTRACT: Background: Pharmaceutical clinical trials are primarily conducted across many countries, yet recruitment targets are frequently not met in time. Electronic health records (EHRs) store large amounts of potentially useful data that could aid in this process. The EHR4CR project aims at re-using EHR data for clinical research purposes. Objective: To evaluate whether the protocol feasibility platform produced by the Electronic Health Records for Clinical Research (EHR4CR) project can be installed and set up in accordance with local technical and governance requirements to execute protocol feasibility queries uniformly across national borders. Methods: We installed specifically engineered software and warehouses at local sites. Approvals for data access and usage of the platform were acquired, and terminology mapping of local site codes to central platform codes was performed. A test data set, or real EHR data where approvals were in place, was loaded into data warehouses. Test feasibility queries were created on a central component of the platform and sent to the local components at eleven university hospitals. Results: To use real, de-identified EHR data we obtained permissions and approvals from 'data controllers' and ethics committees. Through the platform we were able to create feasibility queries, distribute them to eleven university hospitals and retrieve aggregated patient counts of both test data and de-identified EHR data. Conclusion: It is possible to install a uniform piece of software in different university hospitals in five European countries and configure it to the requirements of the local networks, while complying with local data protection regulations. We were also able to set up ETL processes and data warehouses to re-use EHR data for feasibility queries distributed over the EHR4CR platform.
    Methods of Information in Medicine 06/2014; DOI:10.3414/ME13-01-0134 · 1.08 Impact Factor
  • Source
    ABSTRACT: Objectives: The optimal method of identifying people with chronic obstructive pulmonary disease (COPD) from electronic primary care records is not known. We assessed the accuracy of different approaches using the Clinical Practice Research Datalink (CPRD), a UK electronic health record database. Setting: 951 participants registered with a CPRD practice in the UK between 1 January 2004 and 31 December 2012. Individuals were selected for ≥1 of 8 algorithms to identify people with COPD. General practitioners were sent a brief questionnaire and additional evidence to support a COPD diagnosis was requested. All information received was reviewed independently by two respiratory physicians whose opinion was taken as the gold standard. Primary outcome measure: The primary measure of accuracy was the positive predictive value (PPV), the proportion of people identified by each algorithm for whom COPD was confirmed. Results: 951 questionnaires were sent and 738 (78%) returned. After quality control, 696 (73.2%) patients were included in the final analysis. All four algorithms including a specific COPD diagnostic code performed well. Using a diagnostic code alone, the PPV was 86.5% (77.5-92.3%); requiring a diagnosis plus spirometry plus specific medication gave a slightly higher PPV of 89.4% (80.7-94.5%) but reduced case numbers by 10%. Algorithms without specific diagnostic codes had low PPVs (range 12.2-44.4%). Conclusions: Patients with COPD can be accurately identified from UK primary care records using specific diagnostic codes. Requiring spirometry or COPD medications only marginally improved accuracy. The high accuracy applies since the introduction of an incentivised disease register for COPD as part of the Quality and Outcomes Framework in 2004.
    BMJ Open 07/2014; 4(7):e005540. DOI:10.1136/bmjopen-2014-005540 · 2.06 Impact Factor
  • Source
    ABSTRACT: Objectives: To determine whether gout increases risk of incident coronary heart disease (CHD), cerebrovascular disease (CVD) and peripheral vascular disease (PVD) in a large cohort of primary care patients with gout, since there have been no such large studies in primary care. Methods: A retrospective cohort study was performed using data from the Clinical Practice Research Datalink (CPRD). Risk of incident CHD, CVD and PVD was compared in 8386 patients with an incident diagnosis of gout and 39 766 age, sex and registered general practice-matched controls, all aged over 50 years and with no prior vascular history, in the 10 years following incidence of gout or the matched index date (baseline). Multivariable Cox regression was used to estimate HRs, and covariates included sex and baseline measures of age, Body Mass Index, smoking, alcohol consumption, Charlson comorbidity index, history of hypertension, hyperlipidaemia, chronic kidney disease, statin use and aspirin use. Results: Multivariable analysis (HRs with 95% CIs) showed men were at increased risk of any vascular event, HR 1.06 (1.01 to 1.12), any CHD, HR 1.08 (1.01 to 1.15), and PVD, HR 1.18 (1.01 to 1.38), while women were at increased risk of any vascular event, HR 1.25 (1.15 to 1.35), any CHD, HR 1.25 (1.12 to 1.39), and PVD, HR 1.89 (1.50 to 2.38), but not any CVD. Conclusions: In this cohort of over 50s with gout, female patients were at greatest risk of incident vascular events, even after adjustment for vascular risk factors, despite a higher prevalence of both gout and vascular disease in men. Further research is required to establish the reason for this sex difference.
    Annals of the Rheumatic Diseases 08/2014; 74(4). DOI:10.1136/annrheumdis-2014-205252 · 10.38 Impact Factor