Identifying QT prolongation from ECG impressions using natural language processing and negation detection.

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
Studies in Health Technology and Informatics 02/2007; 129(Pt 2):1283-8. DOI: 10.3233/978-1-58603-774-1-1283
Source: PubMed

ABSTRACT: Electrocardiogram (ECG) impressions provide significant information for decision support and clinical research. We investigated the presence of QT prolongation, an important risk factor for sudden cardiac death, in ECG impressions and compared it to the automated calculation of corrected QT (QTc) by ECG machines. We integrated a negation tagging algorithm into the KnowledgeMap concept identifier (KMCI), then applied it to impressions from 44,080 ECGs to identify Unified Medical Language System concepts. We compared the instances of QT prolongation identified by KMCI to the calculated QTc. The algorithm for negation detection had a recall of 0.973 and a precision of 0.982 over 10,490 concepts. A concept query for QT prolongation matched 2,364 ECGs with a precision of 1.00. The positive predictive value of the common QTc cutoffs was 6-21%. ECGs not identified by KMCI as prolonged but with QTc > 450 ms revealed potential causes of miscalculated QTc intervals in 96% of the cases; no definite concept query false negatives were detected. We conclude that a natural language processing system can effectively identify QT prolongation and other cardiac diagnoses from ECG impressions for potential decision support and clinical research.
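
As a rough illustration of what negation tagging involves, the sketch below marks a mention as negated when a trigger phrase appears shortly before it in the same sentence. It is not the algorithm integrated into KMCI, which the abstract does not describe; the trigger list, the scope-breaking words, the five-token window, and the function names are all assumptions chosen for illustration, and the matching is on literal strings rather than UMLS concepts.

```python
import re

# Hypothetical trigger phrases and scope breakers; NOT the KMCI rule set.
NEGATION_TRIGGERS = ("no evidence of", "without", "no", "not", "absent")
SCOPE_BREAKERS = {"but", "however", "although", "except"}
WINDOW = 5  # tokens before a mention that are searched for a trigger (assumed)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_phrase(tokens, phrase):
    """Return every index at which the token list `phrase` starts in `tokens`."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def concept_status(impression, concept):
    """Return 'affirmed', 'negated', or None if the concept string is absent."""
    phrase = tokenize(concept)
    triggers = [tokenize(t) for t in NEGATION_TRIGGERS]
    found = False
    # Treat periods, semicolons, and newlines as sentence boundaries.
    for sentence in re.split(r"[.;\n]+", impression):
        tokens = tokenize(sentence)
        for start in find_phrase(tokens, phrase):
            found = True
            preceding = tokens[max(0, start - WINDOW):start]
            # A scope breaker ends the reach of any earlier trigger.
            for j in range(len(preceding) - 1, -1, -1):
                if preceding[j] in SCOPE_BREAKERS:
                    preceding = preceding[j + 1:]
                    break
            if not any(find_phrase(preceding, t) for t in triggers):
                return "affirmed"  # one un-negated mention is enough
    return "negated" if found else None


if __name__ == "__main__":
    print(concept_status("Sinus rhythm. No evidence of QT prolongation.",
                         "QT prolongation"))
    print(concept_status("Sinus bradycardia with QT prolongation.",
                         "QT prolongation"))
```

A literal string match like this misses paraphrases such as "prolonged QT interval"; mapping text to concepts is the part a concept identifier handles and is outside the scope of this sketch.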

    ABSTRACT: The majority of clinical symptoms are stored as free text in the clinical record, and this information can inform clinical decision support and automated surveillance efforts if it can be accurately processed into computer interpretable data. We developed rule-based algorithms and evaluated a natural language processing (NLP) system for infectious symptom detection using clinical narratives. Training (60) and testing (444) documents were randomly selected from VA emergency department, urgent care, and primary care records. Each document was processed with NLP and independently manually reviewed by two clinicians with adjudication by referee. Infectious symptom detection rules were developed in the training set using keywords and SNOMED-CT concepts, and subsequently evaluated using the testing set. Overall symptom detection performance was measured with a precision of 0.91, a recall of 0.84, and an F measure of 0.87. Overall symptom detection with assertion performance was measured with a precision of 0.67, a recall of 0.62, and an F measure of 0.64. Among those instances in which the automated system matched the reference set determination for symptom, the system correctly detected 84.7% of positive assertions, 75.1% of negative assertions, and 0.7% of uncertain assertions. This work demonstrates how processed text could enable detection of non-specific symptom clusters for use in automated surveillance activities.
    International Journal of Medical Informatics 03/2012; 81(3):143-56. DOI:10.1016/j.ijmedinf.2011.11.005 · 2.72 Impact Factor
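
The F measures reported in the abstract above are consistent with the balanced F measure, the harmonic mean of precision and recall. A minimal check, assuming that definition (the abstract does not state which weighting was used) and a hypothetical function name:

```python
def f_measure(precision, recall):
    """Balanced F measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Symptom detection: precision 0.91, recall 0.84
    print(round(f_measure(0.91, 0.84), 2))  # 0.87
    # Symptom detection with assertion: precision 0.67, recall 0.62
    print(round(f_measure(0.67, 0.62), 2))  # 0.64
```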
    ABSTRACT: OBJECTIVE: Medication safety requires that each drug be monitored throughout its market life as early detection of adverse drug reactions (ADRs) can lead to alerts that prevent patient harm. Recently, electronic medical records (EMRs) have emerged as a valuable resource for pharmacovigilance. This study examines the use of retrospective medication orders and inpatient laboratory results documented in the EMR to identify ADRs. METHODS: Using 12 years of EMR data from Vanderbilt University Medical Center (VUMC), we designed a study to correlate abnormal laboratory results with specific drug administrations by comparing the outcomes of a drug-exposed group and a matched unexposed group. We assessed the relative merits of six pharmacovigilance measures used in spontaneous reporting systems (SRSs): proportional reporting ratio (PRR), reporting odds ratio (ROR), Yule's Q (YULE), the χ² test (CHI), Bayesian confidence propagation neural networks (BCPNN), and a gamma Poisson shrinker (GPS). RESULTS: We systematically evaluated the methods on two independently constructed reference standard datasets of drug-event pairs. The dataset of Yoon et al. contained 470 drug-event pairs (10 drugs and 47 laboratory abnormalities). Using VUMC's EMR, we created another dataset of 378 drug-event pairs (nine drugs and 42 laboratory abnormalities). Evaluation on our reference standard showed that CHI, ROR, PRR, and YULE all had the same F score (62%). When the reference standard of Yoon et al. was used, ROR had the best F score of 68%, with 77% precision and 61% recall. CONCLUSIONS: Results suggest that EMR-derived laboratory measurements and medication orders can help to validate previously reported ADRs, and detect new ADRs.
    Journal of the American Medical Informatics Association 11/2012; 20(3). DOI:10.1136/amiajnl-2012-001119 · 3.93 Impact Factor
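
Four of the six measures named in the abstract above (PRR, ROR, Yule's Q, and the χ² test) can be computed directly from a 2×2 contingency table. The sketch below uses the standard textbook formulas; the cell layout, the function name, and the example counts are assumptions for illustration and are not taken from the study. BCPNN and GPS need Bayesian machinery and are left out.

```python
def disproportionality(a, b, c, d):
    """Standard 2x2 disproportionality measures.

    Assumed cell layout (hypothetical, for illustration):
        a = exposed,   abnormal result     b = exposed,   normal result
        c = unexposed, abnormal result     d = unexposed, normal result
    Zero cells are not handled and will raise ZeroDivisionError.
    """
    n = a + b + c + d
    prr = (a / (a + b)) / (c / (c + d))          # proportional reporting ratio
    ror = (a * d) / (b * c)                      # reporting odds ratio
    yule_q = (a * d - b * c) / (a * d + b * c)   # equals (ROR - 1) / (ROR + 1)
    # Pearson chi-square without continuity correction
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return {"PRR": prr, "ROR": ror, "YULE": yule_q, "CHI": chi2}


if __name__ == "__main__":
    # Invented counts, not data from the study.
    for name, value in disproportionality(20, 80, 10, 90).items():
        print(f"{name}: {value:.4f}")
```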
    ABSTRACT: OBJECTIVE: To create a computable MEDication Indication resource (MEDI) to support primary and secondary use of electronic medical records (EMRs). MATERIALS AND METHODS: We processed four public medication resources, RxNorm, Side Effect Resource (SIDER) 2, MedlinePlus, and Wikipedia, to create MEDI. We applied natural language processing and ontology relationships to extract indications for prescribable, single-ingredient medication concepts and all ingredient concepts as defined by RxNorm. Indications were coded as Unified Medical Language System (UMLS) concepts and International Classification of Diseases, 9th edition (ICD9) codes. A total of 689 extracted indications were randomly selected for manual review for accuracy using dual-physician review. We identified a subset of medication-indication pairs that optimizes recall while maintaining high precision. RESULTS: MEDI contains 3112 medications and 63 343 medication-indication pairs. Wikipedia was the largest resource, with 2608 medications and 34 911 pairs. For each resource, estimated precision and recall, respectively, were 94% and 20% for RxNorm, 75% and 33% for MedlinePlus, 67% and 31% for SIDER 2, and 56% and 51% for Wikipedia. The MEDI high-precision subset (MEDI-HPS) includes indications found within either RxNorm or at least two of the three other resources. MEDI-HPS contains 13 304 unique indication pairs regarding 2136 medications. The mean±SD number of indications for each medication in MEDI-HPS is 6.22±6.09. The estimated precision of MEDI-HPS is 92%. CONCLUSIONS: MEDI is a publicly available, computable resource that links medications with their indications as represented by concepts and billing codes. MEDI may benefit clinical EMR applications and reuse of EMR data for research.
    Journal of the American Medical Informatics Association 04/2013; 20(5). DOI:10.1136/amiajnl-2012-001431 · 3.93 Impact Factor
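
The inclusion rule stated for MEDI-HPS (an indication found in RxNorm, or in at least two of the three other resources) amounts to a small piece of set logic. The sketch below is one way to express it, not the authors' implementation; the function name, the dictionary layout, and the drug-indication pairs are invented for illustration and are not drawn from MEDI.

```python
from collections import Counter


def high_precision_subset(pairs_by_resource, trusted="RxNorm", min_other=2):
    """Keep pairs found in the trusted resource or in >= min_other other resources."""
    trusted_pairs = set(pairs_by_resource.get(trusted, ()))
    counts = Counter()
    for name, pairs in pairs_by_resource.items():
        if name != trusted:
            counts.update(set(pairs))  # count each resource at most once per pair
    supported = {pair for pair, n in counts.items() if n >= min_other}
    return trusted_pairs | supported


if __name__ == "__main__":
    # Toy (medication, indication) pairs; invented, not MEDI content.
    toy = {
        "RxNorm": [("drug_a", "condition_1")],
        "SIDER 2": [("drug_a", "condition_2"), ("drug_b", "condition_3")],
        "MedlinePlus": [("drug_a", "condition_2")],
        "Wikipedia": [("drug_b", "condition_4")],
    }
    print(sorted(high_precision_subset(toy)))
```

In the toy data, `("drug_a", "condition_1")` is kept because it appears in RxNorm, `("drug_a", "condition_2")` because two other resources agree, and the remaining pairs are dropped because only one resource supports each.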