About
50
Publications
14,817
Reads
782
Citations
Introduction
Tong Wang currently works at the Department of Business Analytics, University of Iowa. Tong does research in interpretable machine learning and in machine learning applied to healthcare and business problems.
Additional affiliations
August 2010 - June 2016
Publications (50)
Medical crowdfunding is a popular channel for people seeking financial assistance to cover their medical expenses, allowing them to collect donations from large numbers of donors. However, a mismatch between the supply and demand of donations creates large heterogeneity in the fundraising outcomes across medical crowdfunding campaigns and such unce...
The demand for same-day delivery (SDD) has increased rapidly in the last few years and has particularly boomed during the COVID-19 pandemic. This fast growth is not without its challenges. In 2016, due to low concentrations of memberships and long distances from the depot, certain minority neighborhoods were excluded from receiving Amazon’s SDD service...
Children with orofacial clefting (OFC) present with a wide range of dental anomalies. Identifying these anomalies is vital to understand their etiology and to discern the complex phenotypic spectrum of OFC. Such anomalies are currently identified using intra-oral exams by dentists, a costly and time-consuming process. We claim that automating the p...
We develop a novel interpretable machine learning model, GANNM, and use newly available data to evaluate how different types of marketing campaigns and budget allocations influence malls’ customer traffic. We observe that the response curves that measure the impact of campaign budget on customer traffic differ for different categories of campaigns,...
Lending decisions are usually made with proprietary models that provide minimally acceptable explanations to users. In a future world without such secrecy, what decision support tools would one want to use for justified lending decisions? This question is timely, since the economy has dramatically shifted due to a pandemic, and a massive number of...
We propose Partially Interpretable Estimators (PIE) which attribute a prediction to individual features via an interpretable model, while a (possibly) small part of the PIE prediction is attributed to the interaction of features via a black-box model, with the goal to boost the predictive performance while maintaining interpretability. As such, the...
We propose a model-agnostic approach for mitigating the prediction bias of a black-box decision-maker, and in particular, a human decision-maker. Our method detects in the feature space where the black-box decision-maker is biased and replaces it with a few short decision rules, acting as a "fair surrogate". The rule-based surrogate model is traine...
The demand for same-day delivery (SDD) has increased rapidly in the last few years and has particularly boomed during the COVID-19 pandemic. Existing literature on the problem has focused on maximizing the utility, represented as the total number of expected requests served. However, a utility-driven solution results in unequal opportunities for cu...
We propose a novel interpretable recurrent neural network (RNN) model, called ProtoryNet, in which we introduce a new concept of prototype trajectories. Motivated by the prototype theory in modern linguistics, ProtoryNet makes a prediction by finding the most similar prototype for each sentence in a text sequence and feeding an RNN backbone with th...
We present an interpretable companion model for any pre-trained black-box classifier. The idea is that for any input, a user can decide to either receive a prediction from the black-box model, with high accuracy but no explanations, or employ a companion rule to obtain an interpretable prediction with slightly lower accuracy. The com...
Medical crowdfunding is a popular channel through which people who need financial help paying medical bills collect donations from large numbers of people. However, large heterogeneity exists in donations across cases, and fundraisers face significant uncertainty in whether their crowdfunding campaigns can meet fundraising goals. Therefore, it is important...
Medical crowdfunding has seen rapid growth in recent years and it has become a popular channel for people needing financial help. However, there exists large heterogeneity in donations across cases and fundraisers face significant uncertainty in whether their crowdfunding campaigns can meet fundraising goals. We aim to develop novel algorithms to p...
Driven by an increasing need for model interpretability, interpretable models have become strong competitors for black-box models in many real applications. In this paper, we propose a novel type of model where interpretable models compete and collaborate with black-box models. We present the Model-Agnostic Linear Competitors (MALC) for partially i...
Objectives
The use of peripherally inserted central catheters (PICCs) is an integral part of caring for hospitalised children. We sought to estimate the incidence of and identify the risk factors for complications associated with PICCs in an advanced registered nurse practitioner (ARNP)-driven programme.
Design
Retrospective cohort study.
Setti...
This work addresses the situation where a black-box model with good predictive performance is chosen over its interpretable competitors, and we show interpretability is still achievable in this case. Our solution is to find an interpretable substitute on a subset of data where the black-box model is overkill or nearly overkill while leaving the res...
Interpretable machine learning has become a strong competitor for traditional black-box models. However, the possible loss of the predictive performance for gaining interpretability is often inevitable, putting practitioners in a dilemma of choosing between high accuracy (black-box models) and interpretability (interpretable models). In this work,...
Objective
The American College of Critical Care Medicine recommends that children with persistent fluid, catecholamine, and hormone-resistant septic shock be considered for extracorporeal membrane oxygenation (ECMO) support. Current national estimates of ECMO use in hospitalized children with sepsis are unknown. We sought to examine the use of ECMO...
We present the Multi-value Rule Set (MRS) for interpretable classification with feature efficient presentations. Compared to rule sets built from single-value rules, MRS adopts a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than classical single-value rules in capturing a...
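The idea of a condition that accepts multiple values can be illustrated with a small Python sketch. This is only a toy illustration under stated assumptions, not the authors' implementation: the feature names, the values, and the helpers `mrs_predict` and `expand_to_single_value` are all hypothetical, and the "positive if any rule is satisfied" convention is assumed for the example.

```python
from itertools import product

# Hypothetical multi-value rule set: each rule maps a feature name to a SET
# of accepted values, so a single condition can cover several values at once.
MRS_RULES = [
    {"color": {"blue", "green"}, "size": {"small"}},
    {"color": {"yellow"}},
]


def mrs_predict(instance, rules=MRS_RULES):
    """Return 1 if the instance satisfies every condition of at least one rule."""
    for rule in rules:
        if all(instance.get(feature) in allowed for feature, allowed in rule.items()):
            return 1
    return 0


def expand_to_single_value(rule):
    """List the single-value rules that one multi-value rule stands in for."""
    features = sorted(rule)
    value_lists = [sorted(rule[f]) for f in features]
    return [dict(zip(features, combo)) for combo in product(*value_lists)]


if __name__ == "__main__":
    print(mrs_predict({"color": "green", "size": "small"}))
    print(mrs_predict({"color": "green", "size": "large"}))
    print(expand_to_single_value(MRS_RULES[0]))
```

In this toy setup the first multi-value rule corresponds to two single-value rules, which is the sense in which the multi-value form is more concise.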
We propose a possible solution to a public challenge posed by the Fair Isaac Corporation (FICO), which is to provide an explainable model for credit risk assessment. Rather than present a black box model and explain it afterwards, we provide a globally interpretable model that is as accurate as other neural networks. Our "two-layer additive risk mo...
We propose a Multi-vAlue Rule Set (MRS) model for predicting in-hospital patient mortality. Compared to rule sets built from single-valued rules, MRS adopts a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than classical single-valued rules in capturing and describing patte...
Interpretable machine learning models have received increasing interest in recent years, especially in domains where humans are involved in the decision-making process. However, the possible loss of the task performance for gaining interpretability is often inevitable. This performance downgrade puts practitioners in a dilemma of choosing between a...
Circular dichroism spectra of free and MNP-encapsulated HER2-NCApt.
Notes: The concentration of NCApt is 1 μM. Both free and MNP-encapsulated NCApt exhibited positive peaks at 190 and 280 nm and negative peaks at 210 and 250 nm. The stronger peak signals associated with MNP-encapsulated NCApt suggest that NCApt was successfully encapsulated and con...
Representative fluorescent microscopy images of the uptake of Texas red-labeled oligonucleotide-MNPs by HeLa cells.
Notes: Texas red signals were mainly localized inside the cells. All scale bars are 100 μm.
Abbreviations: HApt, human epidermal growth factor receptor 2 aptamer; MNPs, micelle-like nanoparticles; NCApt, negative control aptamer; nt,...
We introduce a novel generative model for interpretable subgroup analysis for causal inference applications, Causal Rule Sets (CRS). A CRS model uses a small set of short rules to capture a subgroup where the average treatment effect is elevated compared to the entire population. We present a Bayesian framework for learning a causal rule set. The B...
We present the Multi-vAlue Rule Set (MARS) model for interpretable classification with feature efficient presentations. MARS introduces a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than traditional single-valued rules in capturing and describing patterns in data. MARS m...
We present a machine learning algorithm for building classifiers that are comprised of a small number of short rules. These are restricted disjunctive normal form models. An example of a classifier of this form is as follows: If $X$ satisfies (condition $A$ AND condition $B$) OR (condition $C$) OR $\cdots$, then $Y=1$. M...
A Rule Set model consists of a small number of short rules for interpretable classification, where an instance is classified as positive if it satisfies at least one of the rules. The rule set provides reasons for predictions, and also descriptions of a particular class. We present a Bayesian framework for learning Rule Set models, with prior param...
Or's of And's (OA) models are comprised of a small number of disjunctions of conjunctions, also called disjunctive normal form. An example of an OA model is as follows: If ($x_1 = $ 'blue' AND $x_2 = $ 'middle') OR ($x_1 = $ 'yellow'), then predict $Y=1$, else predict $Y=0$. Or's of And's models have the advantage of being interpretable to human expe...
One of the most challenging problems facing crime analysts is that of identifying crime series, which are sets of crimes committed by the same individual or group. Detecting crime series can be an important step in predictive policing, as knowledge of a pattern can be of paramount importance toward finding the offenders or stopping the pattern. Cur...
From a user perspective in modern data centers, temporary data unavailability dominates permanent data unavailability. Complementing data center repair research, this paper focuses on this temporary unavailability. To help manage this traffic-induced unavailability, we propose a coded resolution-aware storage (CRS) scheme for data center video-stre...
Our goal is to automatically detect patterns of crime. Among a large set of crimes that happen every year in a major city, it is challenging, time-consuming, and labor-intensive for crime analysts to determine which ones may have been committed by the same individual(s). If automated, data-driven tools for crime pattern detection are made available...
In this paper, we show that coding can be used in storage area networks (SANs) to improve various quality of service metrics under normal SAN operating conditions, without requiring additional storage space. For our analysis, we develop a model which captures modern characteristics such as constrained I/O access bandwidth limitations. Using this mo...
Many crimes can happen every day in a major city, and figuring out which ones are committed by the same individual or group is an important and difficult data mining challenge. To do this, we propose a pattern detection algorithm called Series Finder, that grows a pattern of discovered crimes from within a database, starting from a "seed" of a few...
In this paper, we study the joint design of multi-resolution (MR) coding and network coding. In the network coding model, we present two coding schemes, intra-layer and inter-layer network coding, and use two elements as design parameters: 1) redundancy at different layers; 2) whether to code within a layer or across layers. Two metrics, average dist...
In this paper, we study the design of network coding for the multiple two-way relaying channels where multiple pairs of sources wish to exchange information with their partners. All nodes are equipped with multiple antennas, and we focus on a particular scenario where the sources have fewer antennas than the relay. For such a case, the application o...
Network coding is a promising technique to improve the performance of wireless networks. In this paper, we establish the hexagram polling network model and propose a network coding scheme applied in this network. Based on the fact that one of the lost data blocks in a certain polling round may be recovered by the network coded data block, we establ...