Bridging the semantics gap between terminologies, ontologies, and information models.

Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Germany.
Studies in health technology and informatics 01/2010; 160(Pt 2):1000-4.
Source: PubMed

ABSTRACT: SNOMED CT and other biomedical vocabularies provide semantic identifiers for all kinds of linguistic expressions, many of which cannot be considered terms in a strict sense. We analyzed such "non-terms" in SNOMED CT and concluded that many of them cannot be interpreted as directly referring to objects or processes, but rather to information entities. After discussing two approaches to representing information entities, viz. the OBO Information Artifact Ontology (IAO) and the HL7 v3 Reference Information Model (RIM), we propose an integrative solution for representing information entities in SNOMED CT that remains compatible with the RIM and the IAO and uses moderately enhanced description logics.

  • ABSTRACT: Some modern Electronic Healthcare Record (EHR) architectures and standards are based on the dual model-based architecture, which defines two conceptual levels: reference model and archetype model. Such architectures represent EHR domain knowledge by means of archetypes, which are considered by many researchers to play a fundamental role in achieving semantic interoperability in healthcare. Consequently, formal methods for validating archetypes are necessary. In recent years, there has been increasing interest in exploring how semantic web technologies in general, and ontologies in particular, can facilitate the representation and management of archetypes, including binding to terminologies, but no solution based on such technologies has been provided to date to validate archetypes. Our approach represents archetypes by means of OWL ontologies. This makes it possible to combine the two levels of the dual model-based architecture in one modeling framework, which can also integrate terminologies available in OWL format. The validation method consists of reasoning on those ontologies to find modeling errors in archetypes: incorrect restrictions over the reference model, non-conformant archetype specializations and inconsistent terminological bindings. The archetypes available in the repositories supported by the openEHR Foundation and the NHS Connecting for Health Program, which are the two largest publicly available ones, have been analyzed with our validation method. For this purpose, we have implemented a software tool called Archeck. Our results show that around one fifth of archetype specializations contain modeling errors, the most common mistakes being related to coded terms and terminological bindings. The analysis of each repository reveals that the two repositories exhibit different patterns of errors. This result reinforces the need for serious efforts to improve archetype design processes.
    Journal of Biomedical Informatics 12/2012; · 2.13 Impact Factor
  • ABSTRACT: The database of Genotypes and Phenotypes (dbGaP) contains various types of data generated from genome-wide association studies (GWAS). These data can be used to facilitate novel scientific discoveries and to reduce cost and time for exploratory research. However, idiosyncrasies and inconsistencies in phenotype variable names are a major barrier to reusing these data. We addressed these challenges in standardizing phenotype variables by formalizing their descriptions using Clinical Element Models (CEM). Designed to represent clinical data, CEMs were highly expressive and thus were able to represent a majority (77.5%) of the 215 phenotype variable descriptions. However, their high expressivity also made it difficult to directly apply them to research data such as phenotype variables in dbGaP. Our study suggested that simplification of the template models makes it more straightforward to formally represent the key semantics of phenotype variables.
    PLoS ONE 01/2013; 8(9):e76384. · 3.53 Impact Factor
  • ABSTRACT: Increasingly detailed, complex and new data about the patient's health status, as well as about medical knowledge, are becoming available. A synopsis of this heterogeneous, patient-customized information is crucial for physicians to make the correct diagnosis. The problem, however, is that these heterogeneous data are not semantically integrated. As a result, most of the available data and knowledge are often not used to their full extent in clinical decisions. Semantic integration requires annotation of clinical data with concepts or codes from established domain ontologies covering medical and clinical knowledge, such as the Foundational Model of Anatomy (FMA), SNOMED CT or the International Classification of Diseases (ICD). Further, an ontologically well-founded information model structuring the references to these ontologies is needed. Today's models of clinical information, like the HL7 Reference Information Model, however, lack a well-defined ontological foundation. The resulting ambiguities make it difficult to map clinical data to their schema and to reuse clinical data stored in them. In this paper we present initial work on a Model for Clinical Information (MCI) based on the Ontology of General Medical Science (OGMS) and other OBO ontologies. MCI focuses on meta-information and high-level concepts with the aim of providing a basis for data integration and knowledge exploration.
    International Conference on Biomedical Ontology; 07/2013

