Temporal disease trajectories condensed from population-wide registry data covering 6.2 million patients

Nature Communications (Impact Factor: 11.47). 06/2014; 5:4022. DOI: 10.1038/ncomms5022
Source: PubMed


A key prerequisite for precision medicine is the estimation of disease progression from the current patient state. Disease correlations and temporal disease progression (trajectories) have mainly been analysed with focus on a small number of diseases or using large-scale approaches without time consideration exceeding a few years. So far, no large-scale studies have focused on defining a comprehensive set of disease trajectories. Here we present a discovery-driven analysis of temporal disease progression patterns using data from an electronic health registry covering the whole population of Denmark. We use the entire spectrum of diseases and convert 14.9 years of registry data on 6.2 million patients into 1,171 significant trajectories. We group these into patterns centred on a small number of key diagnoses such as chronic obstructive pulmonary disease (COPD) and gout, which are central to disease progression and hence important to diagnose early to mitigate the risk of adverse outcomes. We suggest such trajectory analyses may be useful for predicting and preventing future diseases of individual patients.



Available from: Anders Boeck Jensen, Mar 15, 2015
  • ABSTRACT: Multiple diseases (acute or chronic events) can occur together in a patient, a phenomenon known as disease comorbidity, because of the multi-way associations among diseases. Due to shared genetic, molecular, environmental, and lifestyle-based risk factors, many diseases are comorbid in the same patient. Methods for integrating multiple types of omics data play an important role in identifying integrative biomarkers for stratification of patients into groups with different clinical outcomes. Moreover, integrated omics and clinical information may potentially improve prediction accuracy of disease comorbidities. However, there is a lack of effective and efficient bioinformatics and statistical software for true integrative data analysis. With the widespread availability of large-scale omics, phenotype and ontology information, it is becoming increasingly practical to help doctors in clinical diagnostics and comorbidity prediction by providing appropriate software tools. We developed an R software package, POGO, to compute novel estimators of disease comorbidity risks and patient stratification. Starting from the initial diagnosis, omics and clinical data of a patient, the software identifies the association risk of disease comorbidities. The input of this software is the initial diagnosis of a patient and the output provides evidence of disease comorbidities. The functions of POGO offer flexibility for diagnostic applications to predict disease comorbidities, and can be easily integrated into high-throughput and clinical data analysis pipelines. POGO is compliant with the Bioconductor standard and it is freely available at
    Frontiers in Cell and Developmental Biology 06/2015; 3:28. DOI:10.3389/fcell.2015.00028
  • ABSTRACT: Health care research focuses on the description and analysis of the health care system and its requirements. Research-derived innovations are the subject of trials and evaluation of the transfer to daily routine. For this purpose health care research has developed a broad theory-based spectrum of methods. On the other hand, the concept of big data is a new informatics-driven approach to large data sets independent of content. With its technical vocabulary the concept of big data does not easily fit into traditional health care research. Central tasks of health care research such as the generation of theories, norm-oriented evaluations or proof of causality can neither be supported nor replaced by big data. However, the concept of big data has the potential to support health care research, with traditional tasks such as data linkage, analysis of health care paths, quick access to up-to-date data on the distribution and acceptance of health care services, as well as prediction and the generation of hypotheses. The prerequisite for all this is a trust-based linkage of different medical and nonmedical data sources on the basis of the legal regulation of data access and data protection.
    Bundesgesundheitsblatt - Gesundheitsforschung - Gesundheitsschutz 06/2015; DOI:10.1007/s00103-015-2183-9 · 1.42 Impact Factor
  • ABSTRACT: Electronic health records (EHRs), data generated and collected during normal clinical care, are increasingly being linked and used for translational cardiovascular disease research. Electronic health record data can be structured (e.g. coded diagnoses) or unstructured (e.g. clinical notes) and increasingly encapsulate medical imaging, genomic and patient-generated information. Large-scale EHR linkages enable researchers to conduct high-resolution observational and interventional clinical research at an unprecedented scale. A significant amount of preparatory work and research, however, is required to identify, obtain, and transform raw EHR data into research-ready variables that can be statistically analysed. This study critically reviews the opportunities and challenges that EHR data present in the field of cardiovascular disease clinical research and provides a series of recommendations for advancing and facilitating EHR research.
    07/2015; 1(1). DOI:10.1093/ehjqcco/qcv005