A Text Processing Pipeline to Extract Recommendations from Radiology Reports.

Biomedical & Health Informatics, School of Medicine, University of Washington, Seattle, WA
Journal of Biomedical Informatics (Impact Factor: 2.19). 01/2013; 46(2). DOI: 10.1016/j.jbi.2012.12.005
Source: PubMed


Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. The absence of an automated system to identify and track radiology recommendations is an important barrier to ensuring timely follow-up of patients, especially those with non-acute incidental findings on imaging examinations. In this paper, we present a text processing pipeline to automatically identify clinically important recommendation sentences in radiology reports. Our extraction pipeline is based on natural language processing (NLP) and supervised text classification methods. To develop and test the pipeline, we created a corpus of 800 radiology reports, double-annotated for recommendation sentences by a radiologist and an internist. We ran several experiments to measure the impact of different feature types and of the data imbalance between positive and negative recommendation sentences. Our fully statistical approach achieved the best F-score, 0.758, in identifying the critical recommendation sentences in radiology reports.
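For readers unfamiliar with the metric, the following is a minimal sketch of how a sentence-level F-score is computed for the positive (recommendation) class, and why it is more informative than accuracy when positive sentences are rare. It is not the authors' code; the function name `precision_recall_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: 2 recommendation sentences among 10 (imbalanced).
gold = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
all_negative = [0] * len(gold)

print(precision_recall_f1(gold, predicted))
print(precision_recall_f1(gold, all_negative))
```

In this toy setting, a classifier that labels every sentence as negative is still correct on 8 of 10 sentences, yet its F-score for the recommendation class is zero, which is why the F-score is the natural figure to report for an imbalanced task of this kind.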

  • ABSTRACT: Radiological measurements are one of the key variables in widely adopted guidelines (WHO, RECIST) that standardize and objectivize response assessment in oncology care. Measurements are typically described in free-text, narrative radiology reports. We present a natural language processing pipeline that extracts measurements from radiology reports and pairs them with extracted measurements from prior reports of the same clinical finding, e.g., lymph node or mass. A ground truth was created by manually pairing measurements in the abdomen CT reports of 50 patients. A Random Forest classifier trained on 15 features achieved superior results in an end-to-end evaluation of the pipeline on the extraction and pairing task: precision 0.910, recall 0.878, F-measure 0.894, AUC 0.988. Representing the narrative content in terms of UMLS concepts did not improve results. Applications of the proposed technology include data mining, advanced search and workflow support for healthcare professionals managing radiological measurements.
    AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium 01/2013; 2013:1262-71.
  • ABSTRACT: Information loss can occur between radiologists and patients with regard to incidental findings (unexpected or uncertain results) in the interpretation of an image. When a healthcare provider fails to inform a patient of a potential medical issue, quality of care is decreased and medical-legal issues arise. We discuss issues in modeling incidental findings in clinical records, examine available machine learning inputs, and propose a clinical text analysis system using weighted syntactic matching and user feedback learning. To demonstrate that our proposal would support better quality of care at lower cost than prior process-based solutions, we evaluate a prototype system on a gold-standard set of 580 records, yielding 82% sensitivity and 92% specificity, as compared with 43% sensitivity and 100% specificity for an existing manual review process.
    Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics; 09/2013
  • ABSTRACT: Background: Natural Language Processing (NLP) has been shown to be effective for analyzing the content of radiology reports and identifying diagnoses or patient characteristics. We evaluate the combination of NLP and machine learning to detect thromboembolic disease diagnoses and incidental clinically relevant findings from angiography and venography reports written in French. We model thromboembolic diagnoses and incidental findings as a set of concepts, modalities and relations between concepts that can be used as features by a supervised machine learning algorithm. A corpus of 573 radiology reports was de-identified and manually annotated by a physician, with the support of NLP tools, for relevant concepts, modalities and relations. A machine learning classifier was trained on the dataset interpreted by a physician for diagnosis of deep-vein thrombosis, pulmonary embolism and clinically relevant incidental findings. Decision models accounted for the imbalanced nature of the data and exploited the structure of the reports. Results: The best model achieved an F-measure of 0.98 for pulmonary embolism identification, 1.00 for deep-vein thrombosis, and 0.80 for incidental clinically relevant findings. The use of concepts, modalities and relations improved performance in all cases. Conclusions: This study demonstrates the benefits of developing an automated method to identify medical concepts, modalities and relations from radiology reports in French. An end-to-end automatic system for annotation and classification that could be applied to other radiology report databases would be valuable for epidemiological surveillance, performance monitoring, and accreditation in French hospitals.
    BMC Bioinformatics 08/2014; 15(1):266. DOI:10.1186/1471-2105-15-266 · 2.58 Impact Factor