Integrating clinical research with the Healthcare Enterprise: from the RE-USE project to the EHR4CR platform.

INSERM, UMR_S 872 eq20, 15 rue de l'école de médecine, 75006 Paris, France.
Journal of Biomedical Informatics (Impact Factor: 2.13). 08/2011; 44 Suppl 1:S94-102. DOI: 10.1016/j.jbi.2011.07.007
Source: PubMed

ABSTRACT There are different approaches for repurposing clinical data collected in the Electronic Healthcare Record (EHR) for use in clinical research. Semantic integration of "siloed" applications across domain boundaries is the raison d'être of the standards-based profiles developed by the Integrating the Healthcare Enterprise (IHE) initiative - an initiative by healthcare professionals and industry promoting the coordinated use of established standards such as DICOM and HL7 to address specific clinical needs in support of optimal patient care. In particular, the combination of two IHE profiles - the integration profile "Retrieve Form for Data Capture" (RFD), and the IHE content profile "Clinical Research Document" (CRD) - offers a straightforward approach to repurposing EHR data by enabling the pre-population of the case report forms (eCRF) used for clinical research data capture by Clinical Data Management Systems (CDMS) with previously collected EHR data.
The objective was to implement an alternative solution to the RFD-CRD integration profile, centered on two approaches: (i) use of the EHR as the single-source data-entry and persistence point, so that all the clinical data for a given patient can be found in a single source irrespective of the data collection context (patient care or clinical research); and (ii) maximization of the automatic pre-population process through semantic interoperability services that identify duplicate or semantically equivalent eCRF/EHR data elements as they were collected in the EHR context.
The RE-USE architecture and associated profiles are focused on defining a set of scalable, standards-based, IHE-compliant profiles that can enable single-source data collection/entry and cross-system data reuse through semantic integration. Specifically, data reuse is realized through the semantic mapping of data collection fields in electronic Case Report Forms (eCRFs) to data elements previously defined as part of patient care-centric templates in the EHR context. The approach was evaluated in the context of a multi-center clinical trial conducted in a large, multi-disciplinary hospital with an installed EHR.
Data elements of seven eCRFs used in a multi-center clinical trial were mapped to data elements of patient care-centric templates in use in the EHR at the Georges Pompidou hospital. Of the eCRF data elements, 13.4% were found to be represented in EHR templates and were therefore candidates for pre-population. During the execution phase of the clinical study, the semantic mapping architecture enabled data persisted in the EHR context as part of clinical care to be used to pre-populate eCRFs without secondary data entry. To ensure that the pre-populated data are viable for use in the clinical research context, all pre-populated eCRF data must first be approved by a trial investigator before being persisted in a research data store within a CDMS.
Single-source data entry in the clinical care context for use in the clinical research context, a process enabled by using the EHR as the single point of data entry, could, if demonstrated to be a viable strategy, significantly reduce data collection effort while increasing data collection accuracy by eliminating transcription and double-entry errors between the two contexts. It would also ensure that all the clinical data for a given patient, irrespective of the data collection context, are available in the EHR for decision support and treatment planning. The RE-USE approach used mapping algorithms to identify semantic coherence between clinical care and clinical research data elements and to pre-populate eCRFs. The RE-USE project used SNOMED International v.3.5 as its "pivot reference terminology" to support EHR-to-eCRF mapping, a decision that likely enhanced the "recall" of the mapping algorithms. The RE-USE results demonstrate the difficult challenges involved in semantic integration between the clinical care and clinical research contexts.
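
The pivot-terminology idea described in the abstract can be illustrated with a minimal sketch. It is not the RE-USE implementation: the abstract does not describe the mapping algorithms, so this example assumes the simplest possible rule (an eCRF element matches an EHR element when both carry the same pivot code). All names (`DataElement`, `map_ecrf_to_ehr`, `prepopulate`, `coverage`), the field names, and the `PIVOT-…` codes are hypothetical placeholders, not real SNOMED codes or trial data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataElement:
    name: str
    pivot_code: Optional[str]  # hypothetical pivot-terminology code; None if unannotated


def map_ecrf_to_ehr(ecrf: List[DataElement], ehr: List[DataElement]) -> Dict[str, str]:
    """Pair each eCRF element with the EHR element sharing its pivot code (assumed rule)."""
    ehr_by_code = {e.pivot_code: e for e in ehr if e.pivot_code is not None}
    return {
        item.name: ehr_by_code[item.pivot_code].name
        for item in ecrf
        if item.pivot_code in ehr_by_code
    }


def coverage(mapping: Dict[str, str], ecrf: List[DataElement]) -> float:
    """Fraction of eCRF elements that are candidates for pre-population."""
    return len(mapping) / len(ecrf) if ecrf else 0.0


def prepopulate(mapping: Dict[str, str], ehr_values: Dict[str, object]) -> List[dict]:
    """Build draft eCRF entries; each stays unapproved until an investigator signs off."""
    return [
        {"ecrf_field": field, "value": ehr_values[source], "approved": False}
        for field, source in mapping.items()
        if source in ehr_values
    ]


# Illustrative data only.
ecrf = [
    DataElement("systolic_bp", "PIVOT-001"),
    DataElement("smoking_status", "PIVOT-002"),
    DataElement("randomization_arm", None),
    DataElement("study_visit_number", "PIVOT-099"),
]
ehr = [
    DataElement("BP systolic", "PIVOT-001"),
    DataElement("Tobacco use", "PIVOT-002"),
    DataElement("Body weight", "PIVOT-003"),
]

mapping = map_ecrf_to_ehr(ecrf, ehr)
drafts = prepopulate(mapping, {"BP systolic": 128})
print(mapping, coverage(mapping, ecrf), drafts)
```

The `approved` flag mirrors the requirement that pre-populated values be validated by a trial investigator before they are persisted in the CDMS; a real system would also need to handle partial or approximate semantic matches, which this exact-code lookup ignores.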

  • ABSTRACT: Autism spectrum disorders (ASD) represent a group of developmental disabilities with a strong genetic basis. The laboratory mouse is increasingly used as a model organism for ASD, and MGI, the Mouse Genome Informatics resource, is the primary model organism ...
    Journal of Biomedical Informatics 12/2011; 44 Suppl 1:S54-5. · 2.13 Impact Factor
  • ABSTRACT: Medical documentation is a time-consuming task and there is a growing number of documentation requirements. In order to improve documentation, harmonization and standardization based on existing forms and medical concepts are needed. Systematic analysis of forms can contribute to standardization building upon new methods for automated comparison of forms. Objectives of this research are quantification and comparison of data elements for breast and prostate cancer to discover similarities, differences and reuse potential between documentation sets. In addition, common data elements for each entity should be identified by automated comparison of forms. A collection of 57 forms regarding prostate and breast cancer from quality management, registries, clinical documentation of two university hospitals (Erlangen, Münster), research datasets, certification requirements and trial documentation were transformed into the Operational Data Model (ODM). These ODM files were semantically enriched with concept codes and analyzed with the compareODM algorithm. Comparison results were aggregated and lists of common concepts were generated. Grid images, dendrograms and spider charts were used for illustration. Overall, 1008 data elements for prostate cancer and 1232 data elements for breast cancer were analyzed. Average routine documentation consists of 390 data elements per disease entity and site. Comparisons of forms identified up to 20 comparable data elements in cancer conference forms from both hospitals. Urology forms contain up to 53 comparable data elements with quality management and up to 21 with registry forms. Urology documentation of both hospitals contains up to 34 comparable items with international common data elements. Clinical documentation sets share up to 24 comparable data elements with trial documentation. Within clinical documentation, administrative items are the most common comparable items. Selected common medical concepts are contained in up to 16 forms. The amount of documentation for cancer patients is enormous. There is an urgent need for standardized structured single source documentation. Semantic annotation is time-consuming, but enables automated comparison between different form types, hospital sites and even languages. This approach can help to identify common data elements in medical documentation. Standardization of forms and building up forms on the basis of coding systems is desirable. Several comparable data elements within the analyzed forms demonstrate the harmonization potential, which would enable better data reuse. Identifying common data elements in medical forms from different settings with systematic and automated form comparison is feasible.
    Journal of Biomedical Informatics 04/2014; · 2.13 Impact Factor
  • ABSTRACT: OBJECTIVE: Logical Observation Identifiers Names and Codes (LOINC) mapping of laboratory data is often a question of the effort of mapping compared with the benefits of the structure achieved. The new LOINC mapping assistant RELMA (version 2011) has the potential to reduce the effort required for semi-automated mapping. We examined quality, time effort, and sustainability of such mapping. METHODS: To verify the mapping quality, two samples of 100 laboratory terms were extracted from the laboratory system of a German university hospital and processed in a semi-automated fashion with RELMA V.5 and LOINC V.2.34 German translation DIMDI to obtain LOINC codes. These codes were reviewed by two experts from each of two laboratories. Then all 2148 terms used in these two laboratories were processed in the same way. RESULTS: In the initial samples, 93 terms from one laboratory system and 92 terms from the other were correctly mapped. Of the total 2148 terms, 1660 could be mapped. An average of 500 terms per day or 60 terms per hour could be mapped. Of the laboratory terms used in 2010, 99% could be mapped. DISCUSSION: Semi-automated LOINC mapping of non-English laboratory terms has become promising in terms of effort and mapping quality using the new version RELMA V.5. The effort is probably lower than for previous manual mapping. The mapping quality equals that of manual mapping and is far better than that reported with previous automated mapping activities. CONCLUSION: RELMA V.5 and LOINC V.2.34 offer the opportunity to start thinking again about LOINC mapping even in non-English languages, since mapping effort is acceptable and mapping results equal those of previous manual mapping reports.
    Journal of the American Medical Informatics Association 07/2012; · 3.57 Impact Factor

