Application of Information Retrieval Approaches to Case Classification in the Vaccine Adverse Event Reporting System

Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research (CBER), US FDA, Woodmont Office Complex 1, Rm 306N, 1401 Rockville Pike, Rockville, MD, 20852, USA
Drug Safety (Impact Factor: 2.82). 05/2013; 36(7). DOI: 10.1007/s40264-013-0064-4
Source: PubMed


BACKGROUND: Automating the classification of adverse event reports is an important step towards improving the efficiency of vaccine safety surveillance. We previously showed that reports can be classified using features extracted from their text.
OBJECTIVE: The aim of this study was to use the information encoded in the Medical Dictionary for Regulatory Activities (MedDRA®) in the US Vaccine Adverse Event Reporting System (VAERS) to support and evaluate two classification approaches: a multiple information retrieval strategy and a rule-based approach. To evaluate the performance of these approaches, we selected the conditions of anaphylaxis and Guillain-Barré syndrome (GBS).
METHODS: We used MedDRA® Preferred Terms stored in VAERS and two standardized medical terminologies, the Brighton Collaboration (BC) case definitions and Standardized MedDRA® Queries (SMQ), to classify two sets of reports for GBS and anaphylaxis. Two approaches were used: (i) the rule-based instruments provided with the two terminologies (the Automatic Brighton Classification [ABC] tool and the SMQ algorithms); and (ii) the vector space model.
RESULTS: The rule-based instruments, particularly the SMQ algorithms, achieved a high degree of specificity. However, this came at a cost in sensitivity for all but the narrow GBS SMQ algorithm, which outperformed the remaining approaches (sensitivity in the testing set was 99.06% for this algorithm vs. 93.40% for the vector space model). For anaphylaxis, the vector space model achieved higher sensitivity in the testing set than the best values of both the ABC tool and the SMQ algorithms (86.44% vs. 64.11% and 52.54%, respectively).
CONCLUSIONS: Our results showed the superiority of the vector space model over the existing rule-based approaches, irrespective of the standardized medical knowledge represented by either the SMQ or the BC case definition. The vector space model might make automation of case definitions for spontaneous report review more efficient than current rule-based approaches, allowing more time for critical assessment and decision making by pharmacovigilance experts.
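As an illustration of the vector space model named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each report and each case definition is represented as a bag of coded terms, that similarity is the cosine of raw term-count vectors, and that a report is flagged when its score reaches a cut-off. The term lists, the labels, the 0.5 threshold, and the function names are all hypothetical; the abstract does not state the weighting scheme or threshold that was used.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def classify(report_terms, query_terms, threshold):
    """Flag a report whose similarity to the query reaches the threshold."""
    score = cosine_similarity(Counter(report_terms), Counter(query_terms))
    return score >= threshold


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of boolean predictions against labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical query standing in for a case definition; these terms are
# illustrative only and are not taken from the BC definition or any SMQ.
query = ["urticaria", "wheezing", "hypotension", "angioedema"]

# Hypothetical coded reports with made-up reviewer labels.
reports = [
    ["urticaria", "wheezing", "hypotension"],
    ["pyrexia", "injection site pain"],
    ["urticaria", "pyrexia"],
]
actual = [True, False, True]

THRESHOLD = 0.5  # arbitrary cut-off chosen for this sketch
predicted = [classify(r, query, THRESHOLD) for r in reports]
sens, spec = sensitivity_specificity(predicted, actual)
print(predicted, sens, spec)
```

Lowering the threshold in this toy example would flag the third report as well, raising sensitivity at the possible expense of specificity, which is the same trade-off the abstract reports between the vector space model and the rule-based instruments.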

  • Source
    • "One can assess the similarity of the reports to the case definition of anaphylaxis (as represented by the terms in Table 3) using the cosine measure that is easy to compute, albeit very simplistic and lacking inferential basis. Surprisingly, however, and in the case of anaphylaxis, cosine similarity works as well as the rule-based case definition using MedDRA PTs for classification of possible anaphylaxis reports in VAERS database [47]. We note that case definitions are also important in case-based reasoning (CBR), an artificial intelligence paradigm that has been used to build medical decision support systems [17]."
    ABSTRACT: Safety of medical products is a major public health concern. We present a critical discussion of the currently used analytical tools for mining spontaneous reporting systems (SRS) to identify safety signals after use of medical products. We introduce a pattern discovery framework for the analysis of SRS. The terminology ‘pattern discovery’ is borrowed from the engineering and artificial intelligence literature and signifies that the basis of the proposed framework is the medical case, formalizing the cognitive paradigm known to clinicians who evaluate individual patients and individual case safety reports submitted to SRS. The fundamental contribution of this approach is a strong probabilistic component that may account for selection and other biases and facilitates rigorous modeling and inference. We discuss somewhat in depth the concept of signal in pharmacovigilance and connect it with the concept of a pattern; we illustrate this conceptual framework using the example of anaphylaxis. Finally, we propose a research agenda in statistics, informatics, and pharmacovigilance practices needed to advance the pattern discovery framework in both the short and long terms.
    Statistical Analysis and Data Mining 10/2014; 7(5). DOI:10.1002/sam.11233
  • Source
    • "The Vaccine Adverse Event Reporting System (VAERS) is an example of a national passive surveillance method used for detecting adverse events in the USA. VAERS allows direct reporting by members of the public and utilises automated methods for classifying adverse events reported to it [7]. However, such reporting systems vary between countries, collecting data in non-standard ways."
    ABSTRACT: Immunisation is an important part of health care and adverse events following immunisation (AEFI) are relatively rare. AEFI can be detected through long-term follow-up of a cohort or by looking for signals in real-world, routine data drawn from different health systems that use a variety of clinical coding systems. Mapping these is a challenging aspect of integrating data across borders. Ontological representations of clinical concepts provide a method to map similar concepts, in this case AEFI, across different coding systems. We describe a method that uses ontologies to flag definite, probable or possible cases. We use Guillain-Barre syndrome (GBS) as an AEFI to illustrate this method, and the Brighton Collaboration's case definition of GBS as the gold standard. Our method can be used to flag definite, probable or possible cases of GBS. Whilst there has been much research into the use of ontologies in immunisation, it has focussed on database interrogation, whereas ours looks to identify varying signal strength.
    Studies in health technology and informatics 04/2014; 197:15-9.
  • ABSTRACT: We previously demonstrated that a general purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines. The objective of this study was to evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barré Syndrome (GBS). We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS and an information retrieval strategy (i.e. the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM, and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information. MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96% vs. 87.65%). Most of the VaeTM-based information was contained in the secondary diagnostic features. For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority.
    Applied Clinical Informatics 05/2013; 4(1):88-99. DOI:10.4338/ACI-2012-11-RA-0049 · 0.39 Impact Factor