Drug safety data mining with a tree-based scan statistic

Article in Pharmacoepidemiology and Drug Safety · May 2013
DOI: 10.1002/pds.3423 · Source: PubMed
In post-marketing drug safety surveillance, data mining can potentially detect rare but serious adverse events. Assessing an entire collection of drug–event pairs is traditionally performed on a predefined level of granularity. It is unknown a priori whether a drug causes a very specific adverse event or a set of related adverse events, such as mitral valve disorders, all valve disorders, or different types of heart disease. This methodological paper evaluates the tree-based scan statistic data mining method to enhance drug safety surveillance. We use a three-million-member electronic health records database from the HMO Research Network. Using the tree-based scan statistic, we assess the safety of selected antifungal and diabetes drugs, simultaneously evaluating overlapping diagnosis groups at different granularity levels and adjusting for multiple testing. Expected and observed adverse event counts were adjusted for age, sex, and health plan, producing a log likelihood ratio test statistic. Out of 732 evaluated disease groupings, 24 were statistically significant, divided among 10 non-overlapping disease categories. Five of the 10 signals are known adverse effects, four are likely due to confounding by indication, and one may warrant further investigation. The tree-based scan statistic can be successfully applied as a data mining tool in drug safety surveillance using observational data. The total number of statistical signals was modest, and a statistical signal does not imply a causal relationship. Rather, data mining results should be used to generate candidate drug–event pairs for rigorous epidemiological studies to evaluate the individual and comparative safety profiles of drugs.
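The abstract mentions a log likelihood ratio test statistic and an adjustment for multiple testing but gives no formulas. The following is a minimal sketch of how such a scan could work, assuming a Poisson model conditioned on the total event count and a Monte Carlo adjustment based on the maximum statistic over all groupings. Both are assumptions made for illustration, not details stated here. The function names, the toy counts, and the list of groupings (`cuts`) are hypothetical, and the expected counts are assumed to be already adjusted for covariates and scaled to sum to the observed total.

```python
import math
import random

def poisson_llr(c, n, total):
    """Log likelihood ratio for one grouping with c observed and n expected
    events out of `total` events overall (assumed conditional Poisson form).
    Only an excess (c > n) counts as evidence, otherwise the value is 0."""
    if c <= n:
        return 0.0
    llr = c * math.log(c / n)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - n))
    return llr

def max_llr(leaf_counts, leaf_expected, cuts):
    """Largest log likelihood ratio over all groupings.
    Each grouping in `cuts` is a list of leaf indices."""
    total = sum(leaf_counts)
    best = 0.0
    for leaves in cuts:
        c = sum(leaf_counts[i] for i in leaves)
        n = sum(leaf_expected[i] for i in leaves)
        best = max(best, poisson_llr(c, n, total))
    return best

def monte_carlo_p(leaf_counts, leaf_expected, cuts, replicas=999, seed=0):
    """Compare the observed maximum with maxima from data simulated under the
    null hypothesis. Using the maximum over all groupings is what accounts
    for the many overlapping tests."""
    rng = random.Random(seed)
    total = sum(leaf_counts)
    observed = max_llr(leaf_counts, leaf_expected, cuts)
    leaves = list(range(len(leaf_counts)))
    exceed = 0
    for _ in range(replicas):
        sim = [0] * len(leaf_counts)
        # Under the null, each event falls in a leaf in proportion to its expected count.
        for i in rng.choices(leaves, weights=leaf_expected, k=total):
            sim[i] += 1
        if max_llr(sim, leaf_expected, cuts) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (replicas + 1)

# Made-up toy data: four leaf diagnoses and seven overlapping groupings.
toy_counts = [12, 3, 2, 3]
toy_expected = [5.0, 5.0, 5.0, 5.0]
toy_cuts = [[0], [1], [2], [3], [0, 1], [2, 3], [0, 1, 2, 3]]
obs_llr, p_value = monte_carlo_p(toy_counts, toy_expected, toy_cuts)
```

Because the p-value is derived from the rank of the observed maximum among the simulated maxima, a single p-value covers the whole collection of groupings rather than each grouping separately.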
    • "TreeScan is a signal detection method that simultaneously looks for excess risk in any of a large number of individual cells in a database and in groups of closely related cells, formally adjusting the p-values for the multiple testing inherent in the large number of overlapping diagnosis groups evaluated [6,42,43]. The paper by Kulldorff et al. (2012) details the TreeScan approach for drug safety surveillance [42]. In brief, a hierarchical classification tree is first constructed for the outcomes where related diagnoses are close to each other on the tree."
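To illustrate the hierarchical classification tree described in the excerpt, here is a small sketch that stores a tree as child-to-parent links and rolls leaf-level event counts up to every ancestor, so that each level of granularity has its own count. The tree shape, the node names beyond those mentioned in the abstract, the counts, and the `roll_up` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical three-level tree: each node maps to its parent, the root maps to None.
PARENT = {
    "mitral valve disorders": "valve disorders",
    "aortic valve disorders": "valve disorders",
    "valve disorders": "heart disease",
    "cardiomyopathy": "heart disease",
    "heart disease": None,
}

def roll_up(leaf_counts, parent):
    """Add each leaf's count to the leaf itself and to all of its ancestors."""
    totals = defaultdict(int)
    for leaf, count in leaf_counts.items():
        node = leaf
        while node is not None:
            totals[node] += count
            node = parent[node]
    return dict(totals)

# Made-up event counts at the most specific level.
node_counts = roll_up(
    {"mitral valve disorders": 4, "aortic valve disorders": 1, "cardiomyopathy": 2},
    PARENT,
)
```

With the counts available at every node, a scan can evaluate a narrow diagnosis and its broader parent groupings side by side.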
