The Human Phenotype Ontology: A Tool for Annotating and Analyzing Human Hereditary Disease

Institute for Medical Genetics, Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany.
The American Journal of Human Genetics (Impact Factor: 10.93). 11/2008; 83(5):610-5. DOI: 10.1016/j.ajhg.2008.09.017
Source: PubMed


There are many thousands of hereditary diseases in humans, each of which has a specific combination of phenotypic features, but computational analysis of phenotypic data has been hampered by lack of adequate computational data structures. Therefore, we have developed a Human Phenotype Ontology (HPO) with over 8000 terms representing individual phenotypic anomalies and have annotated all clinical entries in Online Mendelian Inheritance in Man with the terms of the HPO. We show that the HPO is able to capture phenotypic similarities between diseases in a useful and highly significant fashion.
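As an illustration of how an ontology can capture phenotypic similarity between diseases, the following is a minimal sketch rather than the authors' implementation. It assumes a Resnik-style measure, in which two terms are as similar as the information content of their most informative common ancestor, and it averages best matches between two diseases' annotation sets. The toy ontology, the disease names, and all function names are hypothetical and are not taken from the HPO or from OMIM.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents (is_a edges).
PARENTS = {
    "Phenotypic abnormality": set(),
    "Abnormality of the limbs": {"Phenotypic abnormality"},
    "Abnormality of the hand": {"Abnormality of the limbs"},
    "Syndactyly": {"Abnormality of the hand"},
    "Polydactyly": {"Abnormality of the hand"},
    "Abnormality of the eye": {"Phenotypic abnormality"},
    "Cataract": {"Abnormality of the eye"},
}

# Hypothetical disease annotations (invented for this example).
ANNOTATIONS = {
    "Disease A": {"Syndactyly", "Cataract"},
    "Disease B": {"Polydactyly"},
    "Disease C": {"Cataract"},
    "Disease D": {"Syndactyly", "Polydactyly"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of diseases annotated with t or a descendant of t)."""
    n = len(ANNOTATIONS)
    counts = {term: 0 for term in PARENTS}
    for terms in ANNOTATIONS.values():
        # An annotation to a term implies annotation to all of its ancestors.
        implied = set().union(*(ancestors(t) for t in terms))
        for term in implied:
            counts[term] += 1
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}


IC = information_content()


def term_similarity(t1, t2):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(IC.get(t, 0.0) for t in common)


def disease_similarity(d1, d2):
    """Symmetric average of best-match term similarities."""
    def one_way(a, b):
        return sum(max(term_similarity(s, t) for t in b) for s in a) / len(a)

    a, b = ANNOTATIONS[d1], ANNOTATIONS[d2]
    return 0.5 * (one_way(a, b) + one_way(b, a))


if __name__ == "__main__":
    for pair in [("Disease D", "Disease B"),
                 ("Disease A", "Disease B"),
                 ("Disease B", "Disease C")]:
        print(pair, round(disease_similarity(*pair), 3))
```

In this toy setting, diseases that share specific hand anomalies score higher than diseases that share only the root term, which is the behavior a phenotype ontology is meant to make computable.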



Available from: Sebastian Köhler
    • "The HPO [19] is a high dimensional feature space for representing the complexity of pathologies that are observed in human disease. Representing points in this space in a low-dimensional map is a difficult computational challenge. "
    ABSTRACT: Genome-wide data are increasingly important in the clinical evaluation of human disease. However, the large number of variants observed in individual patients challenges the efficiency and accuracy of diagnostic review. Recent work has shown that systematic integration of clinical phenotype data with genotype information can improve diagnostic workflows and prioritization of filtered rare variants. We have developed visually interactive, analytically transparent analysis software that leverages existing disease catalogs, such as the Online Mendelian Inheritance in Man database (OMIM) and the Human Phenotype Ontology (HPO), to integrate patient phenotype and variant data into ranked diagnostic alternatives. Our tool, “OMIM Explorer”, extends the biomedical application of semantic similarity methods beyond those reported in previous studies. The tool also provides a simple interface for translating free-text clinical notes into HPO terms, enabling clinical providers and geneticists to contribute phenotypes to the diagnostic process. The visual approach uses semantic similarity with multidimensional scaling to collapse high-dimensional phenotype and genotype data from an individual into a graphical format that contextualizes the patient within a low-dimensional disease map. The map proposes a differential diagnosis and algorithmically suggests potential alternatives for phenotype queries, in essence generating a computationally assisted differential diagnosis informed by the individual’s personal genome. Visual interactivity allows the user to filter and update variant rankings by interacting with intermediate results. The tool also implements an adaptive approach for disease gene discovery based on patient phenotypes. We retrospectively analyzed pilot cohort data from the Baylor Miraca Genetics Laboratory, demonstrating performance of the tool and workflow in the re-analysis of clinical exomes. Our tool assigned to clinically reported variants a median rank of 2, placing causal variants in the top 1% of filtered candidates across the 47 cohort cases with reported molecular diagnoses of exome variants in OMIM Morbidmap genes. Our tool outperformed Phen-Gen, eXtasy, PhenIX, PHIVE, and hiPHIVE in the prioritization of these clinically reported variants. Our integrative paradigm can improve efficiency and, potentially, the quality of genomic medicine by more effectively utilizing available phenotype information, catalog data, and genomic knowledge.
    Preview · Article · Dec 2016 · Genome Medicine
    • "We reused many different ontologies from the OBO library, and consequently patterns of these ontologies were reused. For example, the basic representation of a clinical finding is defined analogously to the representation of human phenotypes in the Human Phenotype Ontology (HP) [35, 36]. This has the advantage that clinical findings defined by MCI can be automatically classified in terms of HP such as splenomegaly, lymphadenopathy, etc. "
    ABSTRACT: Background: In radiology, a vast amount of diverse data is generated, and unstructured reporting is standard. Hence, much useful information is trapped in free-text form, and often lost in translation and transmission. One relevant source of free-text data consists of reports covering the assessment of changes in tumor burden, which are needed for the evaluation of cancer treatment success. Any change of lesion size is a critical factor in follow-up examinations. It is difficult to retrieve specific information from unstructured reports and to compare them over time. Therefore, a prototype was implemented that demonstrates the structured representation of findings, allowing selective review in consecutive examinations and thus more efficient comparison over time. Methods: We developed a semantic Model for Clinical Information (MCI) based on existing ontologies from the Open Biological and Biomedical Ontologies (OBO) library. MCI is used for the integrated representation of measured image findings and medical knowledge about the normal size of anatomical entities. An integrated view of the radiology findings is realized by a prototype implementation of a ReportViewer. Further, RECIST (Response Evaluation Criteria In Solid Tumors) guidelines are implemented by SPARQL queries on MCI. The evaluation is based on two data sets of German radiology reports: An oncologic data set consisting of 2584 reports on 377 lymphoma patients and a mixed data set consisting of 6007 reports on diverse medical and surgical patients. All measurement findings were automatically classified as abnormal/normal using formalized medical background knowledge, i.e., knowledge that has been encoded into an ontology. A radiologist evaluated 813 classifications as correct or incorrect. All unclassified findings were evaluated as incorrect. 
    Results: The proposed approach allows the automatic classification of findings with an accuracy of 96.4% for oncologic reports and 92.9% for mixed reports. The ReportViewer permits efficient comparison of measured findings from consecutive examinations. The implementation of RECIST guidelines with SPARQL enhances the quality of the selection and comparison of target lesions as well as the corresponding treatment response evaluation. Conclusions: The developed MCI enables an accurate integrated representation of reported measurements and medical knowledge. Thus, measurements can be automatically classified and integrated in different decision processes. The structured representation is suitable for improved integration of clinical findings during decision-making. The proposed ReportViewer provides a longitudinal overview of the measurements.
    Full-text · Article · Dec 2015 · BMC Medical Informatics and Decision Making
    • "ontology [Robinson et al., 2008] terms HP0010708 and HP0010713. Sanger Sequencing "
    ABSTRACT: Synpolydactyly (SPD) is a rare congenital limb disorder characterized by syndactyly between the third and fourth fingers and an additional digit in the syndactylous web. In most cases SPD is caused by heterozygous mutations in HOXD13 resulting in the expansion of an N-terminal polyalanine tract. If homozygous, the mutation results in severe shortening of all metacarpals and phalanges with a morphological transformation of metacarpals to carpals. Here, we describe a novel homozygous missense mutation in a family with unaffected consanguineous parents and severe brachydactyly and metacarpal-to-carpal transformation in the affected child. We performed whole exome sequencing on the index patient, followed by Sanger sequencing of parents and patient to investigate cosegregation. The DNA-binding ability of the mutant protein was tested with electrophoretic mobility shift assays. We demonstrate that the c.938C>G (p.313T>R) mutation in the DNA-binding domain of HOXD13 prevents binding to DNA in vitro. Our results show, to our knowledge for the first time, that a missense mutation in HOXD13 underlies severe brachydactyly with metacarpal-to-carpal transformation. The mutation is non-penetrant in heterozygous carriers. In conjunction with the literature, we propose the possibility that the metacarpal-to-carpal transformation results from a homozygous loss of functional HOXD13 protein in humans in combination with an accumulation of non-functional HOXD13 that might be able to interact with other transcription factors in the developing limb. © 2015 Wiley Periodicals, Inc.
    Full-text · Article · Nov 2015 · American Journal of Medical Genetics Part A