Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies

Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany.
The American Journal of Human Genetics (Impact Factor: 10.99). 10/2009; 85(4):457-64. DOI: 10.1016/j.ajhg.2009.09.003
Source: PubMed

ABSTRACT The differential diagnostic process attempts to identify candidate diseases that best explain a set of clinical features. This process can be complicated by the fact that the features can have varying degrees of specificity, as well as by the presence of features unrelated to the disease itself. Depending on the experience of the physician and the availability of laboratory tests, clinical abnormalities may be described in greater or lesser detail. We have adapted semantic similarity metrics to measure phenotypic similarity between queries and hereditary diseases annotated with the use of the Human Phenotype Ontology (HPO) and have developed a statistical model to assign p values to the resulting similarity scores, which can be used to rank the candidate diseases. We show that our approach outperforms simpler term-matching approaches that do not take the semantic interrelationships between terms into account. The advantage of our approach was greater for queries containing phenotypic noise or imprecise clinical descriptions. The semantic network defined by the HPO can be used to refine the differential diagnosis by suggesting clinical features that, if present, best differentiate among the candidate diagnoses. Thus, semantic similarity searches in ontologies represent a useful way of harnessing the semantic structure of human phenotypic abnormalities to help with the differential diagnosis. We have implemented our methods in a freely available web application for the field of human Mendelian disorders.
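
The abstract does not name the similarity metric or the sampling scheme, so the following is only a minimal sketch under stated assumptions. It assumes a Resnik-style measure (information content of the most informative common ancestor), a query score that averages each query term's best match among a disease's annotations, and an empirical p value estimated by scoring random queries of the same size. The ontology, term names, diseases, and function names are invented toy data for illustration, not HPO content or the authors' implementation.

```python
import math
import random

# Toy is-a hierarchy (child -> parents); term names are invented, not real HPO IDs.
PARENTS = {
    "abnormality": [],
    "eye": ["abnormality"],
    "heart": ["abnormality"],
    "cataract": ["eye"],
    "myopia": ["eye"],
    "septal_defect": ["heart"],
}

# Toy disease annotations (hypothetical diseases).
DISEASES = {
    "disease_A": {"cataract", "septal_defect"},
    "disease_B": {"myopia"},
    "disease_C": {"septal_defect"},
    "disease_D": {"cataract", "myopia"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of diseases annotated with t or a descendant)."""
    counts = {t: 0 for t in PARENTS}
    for terms in DISEASES.values():
        implied = set().union(*(ancestors(t) for t in terms))
        for t in implied:
            counts[t] += 1
    n = len(DISEASES)
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}


IC = information_content()


def term_similarity(t1, t2):
    """Assumed Resnik-style similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(IC[a] for a in common)


def query_score(query, disease):
    """Average, over query terms, of the best match among the disease's terms."""
    terms = DISEASES[disease]
    return sum(max(term_similarity(q, d) for d in terms) for q in query) / len(query)


def empirical_p_value(query, disease, n_samples=2000, seed=0):
    """Fraction of random same-size queries that score at least as high."""
    rng = random.Random(seed)
    observed = query_score(query, disease)
    all_terms = sorted(PARENTS)
    hits = sum(
        query_score(rng.sample(all_terms, len(query)), disease) >= observed
        for _ in range(n_samples)
    )
    return hits / n_samples


def rank_diseases(query):
    """Rank by ascending p value, breaking ties by descending score."""
    rows = [(empirical_p_value(query, d), -query_score(query, d), d) for d in DISEASES]
    return [d for _, _, d in sorted(rows)]
```

With the query ["myopia", "septal_defect"], exact term matching ties all four toy diseases at one shared term each. The semantic score separates them: disease_A scores highest because its cataract annotation shares the ancestor "eye" with the query term myopia, which is the kind of imprecise description the abstract says the approach handles better.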

Available from: Peter Krawitz, Aug 18, 2015
    • "Up till now there does not exist a valid, general-purpose definition of similarity measure. There do exist several special-purpose definitions which have been employed with success in cluster analysis [31] [33], search [2] [3], classification [11] [30] [35], recognition [28] [36] and diagnostics [1] [34]. In this section, first we give a notion of integral of intuitionistic fuzzy sets and then propose a new form of intuitionistic fuzzy implication, inclusion and give a similarity measure."
    ABSTRACT: First we give the notion of the integral of an intuitionistic fuzzy set and introduce an intuitionistic fuzzy implicator and an intuitionistic fuzzy inclusion measure. Then we propose a new measure of similarity between two intuitionistic fuzzy sets based on the intuitionistic fuzzy inclusion measure. Examples are given to illustrate our notion and the application of this new similarity measure in multi-criteria decision making.
    Journal of Intelligent and Fuzzy Systems 02/2016; · 0.94 Impact Factor
    • "Moreover, most of the systems are designed to make a prediction about a specific disease or a class of diseases. Phenomizer is a web-based system that produces a ranked list of hereditary diseases for a set of clinical features [6]. This system only measures the structural similarity of phenotypes between query and diseases using the Human Phenotype Ontology (HPO) [7] by developing a statistical model to assign p values to the resulting similarity scores, which can be used to rank the candidate diseases."
    ABSTRACT: With the availability of large amounts of medical knowledge on the Internet, such as the human disease network, the protein-protein interaction (PPI) network, and phenotype-gene and gene-disease bipartite networks, it becomes practical to help doctors by suggesting plausible hereditary diseases for a set of clinical phenotypes. However, identifying candidate diseases that best explain a set of clinical phenotypes by considering various heterogeneous networks is still a challenging task. In this paper, we propose a new method for estimating a ranked list of plausible diseases by associating phenotype-gene with gene-disease bipartite networks. Our approach is to count the frequency of all the paths from a phenotype to a disease through their associated causative genes, and link the phenotype to the disease with path frequency in a new phenotype-disease bipartite (PDB) network. After that, we generate the candidate weights for the edges between phenotypes and diseases in the PDB network. We evaluate our proposed method in terms of Normalized Discounted Cumulative Gain (NDCG), and demonstrate that we outperform the previously known disease ranking method called Phenomizer.
    Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference 07/2013; 2013:3475-3478. DOI:10.1109/EMBC.2013.6610290
    • "Diseases with less than three annotations were excluded at this step. Afterward, we calculated the semantic similarity between all pairs of diseases, using the symmetric similarity measure as described for the Phenomizer [Köhler et al., 2009]. At first, the similarity between two terms of the HPO has to be defined. "
    ABSTRACT: Neurological disorders comprise one of the largest groups of human diseases. Due to the myriad symptoms and the extreme degree of clinical variability characteristic of many neurological diseases, the differential diagnosis process is extremely challenging. Even though most neurogenetic diseases are individually rare, collectively, the subgroup of neurogenetic disorders is large, comprising more than 2,400 different disorders. Recently, increasing efforts have been undertaken to unravel the molecular basis of neurogenetic diseases and to correlate pathogenetic mechanisms with clinical signs and symptoms. In order to enable computer-based analyses, the systematic representation of the neurological phenotype is of major importance. We demonstrate how the Human Phenotype Ontology (HPO) can be incorporated into these efforts by providing a systematic semantic representation of phenotypic abnormalities encountered in human genetic diseases. The combination of the HPO together with the Orphanet disease classification represents a promising resource for automated disease classification, performing computational clustering and analysis of the neurogenetic phenome. Furthermore, standardized representations of neurologic phenotypic abnormalities employing the HPO link neurological phenotypic abnormalities to anatomical and functional entities represented in other biomedical ontologies through the semantic references provided by the HPO.
    Human Mutation 09/2012; 33(9):1333-9. DOI:10.1002/humu.22112 · 5.05 Impact Factor