Understanding the Opioid Epidemic: Human-Based Versus Algorithmic-Based Perceptions, Treatments, and Guidelines

Abstract

As a major public health crisis, the opioid epidemic caused over 556,000 deaths in the U.S. between 2000 and 2020. To control the epidemic, the Centers for Disease Control and Prevention (CDC) has developed general guidelines that encourage physicians to use opioid medications only when their benefits outweigh their risks. The CDC's 2016 guidelines mainly left it to physicians to decide when the benefits outweigh the risks. In 2022, the CDC modified its recommendations to make them somewhat less reliant on each individual physician's perception of benefits versus risks. In complex and high-stakes decision-making environments such as those pertaining to the use of opioid medications, it is not clear whether and how human-based perceptions might differ from algorithmic-based ones. In this study, we first develop longitudinal machine learning algorithms (e.g., historical random forest, recurrent neural networks, and long short-term memory networks) and train them on clinical evidence of more than 3 million patients. We then feed the best machine learning algorithm into a mathematical model that determines cost-effective treatments for each patient in a personalized manner. Through extensive numerical experiments, we compare the treatment options and recommendations from our algorithmic-based approach with the human-based ones currently followed in medical practice. Compared to the human-based approach, our results show that following our algorithmic-based treatments yields an average gain in quality-adjusted life years of about 2.82 days and an average cost saving of about $461.46 per patient per year. Finally, we use our findings to generate insights for policymakers as well as individual physicians into better ways of managing opioid prescriptions (and hence, the opioid epidemic) by incorporating and interacting with our algorithmic-based approach.
Alireza Boloori
Milgard School of Business, University of Washington, Tacoma, aboloori@uw.edu
Soroush Saghafian
Harvard Kennedy School, Harvard University, Soroush_Saghafian@hks.harvard.edu
Stephen J. Traub
Department of Emergency Medicine, Brown University, stephen.traub@brownphysicians.org
Key words : Human vs. machine; opioid epidemic; pain management; personalized medicine; machine
learning
History : December 8, 2022
1. Introduction
According to the National Institute on Drug Abuse (NIDA), 915,515 drug-related deaths occurred
in the U.S. between 2000 and 2020, among which opioid analgesics (i.e., painkillers) were the main
contributing factor accounting for 556,472 deaths (60.78% of total deaths). These opioid painkillers
often result in patients switching to heroin or synthetic opioids (e.g., fentanyl), which, in turn,
caused an additional 357,423 deaths during the same period (NIDA 2022). The economic cost of the
U.S. opioid epidemic was estimated to be $1,021 billion in 2017 (Luo et al. 2021), with about two
million people being either dependent on prescription opioids or abusing them (USA Today 2016).
To address this crisis, the Centers for Disease Control and Prevention (CDC) proposed a set of
guidelines in 2016 with the aim of reducing opioid prescriptions by clinicians (CDC 2016, Dowell et
al. 2016). The CDC guidelines, however, have been widely criticized for several reasons, including
Author: Algorithm-Based vs. Human-Based Management of Pain Treatments
2Article submitted to ; manuscript no.
the fact that they did not make use of existing clinical evidence to provide a clear level of specificity. For
example, the American Medical Association (AMA) criticized CDC guidelines by stating that “we
continue to have serious concerns that some [of these guidelines] either contain a degree of specificity
not supported by the existing evidence or conflict with official Food and Drug Administration (FDA)-
approved product labeling for opioid analgesic products.” (AMA 2016)
In essence, the CDC guidelines heavily relied on human judgment by recommending that “Clinicians
should consider opioid therapy only if expected benefits for both pain and function are anticipated
to outweigh risks to the patient” (Dowell et al. 2016). However, understanding when, and
for what patients, the benefits outweigh the risks is not an easy task for human clinicians. Furthermore,
the lack of specificity in the CDC recommendations makes them a “one-size-fits-all”
approach. For example, instead of providing recommendations tailored to each patient’s characteristics,
the CDC recommended fixed thresholds for all patients: no more than 90 morphine milligram
equivalent (MME) per day for the strength of opioids or no more than 7 days of supply for acute
pain (CDC 2016).¹ The lack of a clear treatment guideline tailored to each patient’s characteristics,
juxtaposed with human-based perceptions, has potentially led clinicians to show wide variations
in opioid prescriptions, which, in turn, is another contributing factor to the opioid epidemic
(Barnett et al. 2017). A main goal of our study is to use large-scale clinical evidence and develop an
analytics-driven algorithmic-based approach that can (1) help clinicians (as the human component
of the decision-making process) quantify when, and for what patients, the benefits of opioid
prescriptions outweigh their risks, and (2) allow policymakers to create recommendations that are
personalized (i.e., adapted to each patient’s characteristics) and not based on one-size-fits-all rules.
To this end, we make use of large-scale claims data, the Merative™ MarketScan® Commercial
Dissertation Databases, which contain the history of medical encounters and prescribed medications
of over 3 million patients. A summary of our data is shown in Table 1. From this data, we
retrieve information on the medical history of each individual patient, including diagnoses made by
providers, prescriptions of pharmacologic treatments (opioid and non-opioid medications), and use
of non-pharmacologic treatments (e.g., physical therapy and chiropractic). Training longitudinal
machine learning algorithms (e.g., historical random forest, recurrent neural networks, and long
short-term memory) on this data allows us to predict, based on each individual patient’s char-
acteristics, the risks of (1) opioid dependence, abuse, overdose, or death, and (2) pain remaining
untreated or undertreated. We refer to these risks as the risks of opioid use disorder (OUD) and
undertreated pain (UTP), respectively. We then compare these machine learning algorithms, and
feed the one that has the most accurate risk predictions to a mathematical model that allows
¹ In early 2022, the CDC delivered the first draft of a new set of guidelines, in which the most notable change from the
2016 guidelines is the removal of the thresholds for the opioid dose and duration of supply. Despite this change, many
physicians, patients, and organizations have reservations about these guidelines, stating that the modified guidelines
still mostly focus on the harms of opioids rather than their benefits in avoiding poorly managed pain (NPR 2022).
Table 1 Data summary (left: some of the variables in the data; right: patient demographics, risk covariates, etc.)

Left panel: variables and data tables
Data tables: A = annual insurance enrollment of patients; I = inpatient admissions; S = inpatient services;
O = outpatient services; D = outpatient pharmaceutical claims; R = Red Book (drugs general information).
Variables: Patient ID; Age; Gender; Monthly enrollment; Admission date; Discharge date; Discharge status;
Service year; Service place; Service date start; Service date finish; Provider type; Diagnosis codes;
Procedure codes; Procedure type; Drug name; Strength; Consumption method; Schedule type; Drug ID;
Therapeutic class; Days supply; # units dispensed; # refills; Payments.

Right panel:
Variable                                        Average (S.D.)
# patients                                      3,013,637
# observations (visits & prescriptions)         25,340,400
# pain medication prescriptions                 11,302,380
Demographics
  Age (years) pre-supply                        40.02 (15.76)
  Fraction gender (female)                      57.92% (0.03%)
Pain pre-supply
  Fraction with acute (no chronic) pain         56.18% (0.03%)
  Fraction with chronic (no acute) pain         42.12% (0.03%)
Risk: surgery or inpatient admission
  Fraction with surgical procedures             77.16% (0.02%)
  Fraction with inpatient admissions            22.87% (0.02%)
Risk: behavioral factors
  Fraction with alcohol consumption             1.31% (0.01%)
  Fraction with smoking                         3.98% (0.01%)
  Fraction with substance abuse                 0.49% (0.00%)
  Fraction with non-substance abuse             1.11% (0.01%)
  Fraction with mental health disorder          55.45% (0.03%)
Fractions are out of the number of patients (3,013,637).
determining cost-effective multi-modal pain treatments (i.e., including both pharmacologic and
non-pharmacologic treatments) considering both the benefits and risks of different treatment options. Despite the
high accuracy of the machine learning methods in risk prediction, their black-box nature, together
with the curse of dimensionality in the optimization model (due to considering multiple treatment
modes), may reduce the interpretability of our approach, making it harder for clinicians and
policymakers to understand and follow. Thus, to facilitate the adoption of our algorithmic-based
approach, we develop two heuristic solution methods that are (a) highly interpretable, and
(b) easily implementable in a medical decision-support system.
Overall, our contributions are two-fold:
(1) We develop an algorithmic-based approach capable of dynamically finding multi-modal treat-
ments personalized to individual patients based on their characteristics.
(2) By training our algorithm on large-scale clinical data, we provide important insights and impli-
cations for both policymakers and individual physicians:
By comparing the cost-effectiveness of our algorithmic-based treatment plans with the CDC 2016
and 2022 guidelines, we find that the modifications the CDC has made in its 2022 guidelines
would improve the cost-effectiveness of interventions adopted by physicians. However, we find
that the CDC 2022 guidelines are still not as effective as our algorithmic-based approach, and
hence, provide insights into ways the guidelines could be further improved. As the CDC is
planning to soon finalize the evaluation of the first draft of its 2022 guidelines (NPR 2022), we
hope our insights help the authorities to provide more effective recommendations.
Our results show that the CDC guidelines should also emphasize the harms of poorly managed
pain as much as they target avoiding the harms of using opioids. For example, we find that under
the current practice, for each patient experiencing OUD, there are about 24 patients with UTP.
The algorithmic-based treatment plans we obtain provide a much better balance between OUD
and UTP, and hence, can reduce this number to about 10.
Our algorithmic-based approach would improve the average quality-adjusted life years (QALYs)
gained and cost incurred compared to the human-based approach (in which a human-based judgment
is used to assess the potential benefits and risks) by 2.82 days and $461.46 per patient per
year, respectively. We also find that patients with acute pain, or those with a history of behavioral
factors (e.g., mental health disorders or substance abuse) or surgeries/inpatient admissions
would benefit most from the treatment plans we obtain.
Our results show that, compared to the current human-based approach, opioids could be pre-
scribed at a higher intensity for patients with no history of behavioral factors. Among this group,
the opioid dose can be tapered down faster for younger females (compared to older males) or
patients with a record of surgeries/inpatient admissions. Also, to balance the opioid dose being
tapered down, the duration of opioid medications should be increased after the onset of pain,
and then be decreased towards the end of the therapeutic course. Furthermore, in the presence
of behavioral factors or surgeries/inpatient admissions, older males can be prescribed opioid
medications for a longer duration. However, in their absence, increasing the duration of such
medications is mainly useful only for younger females.
Finally, we find that the rate of using non-pharmacologic treatments (NPHTs), such as physical
therapy or chiropractic, needs to be increased. In particular, patients with acute pain could use
NPHTs for a few months after the onset of pain, while any record of behavioral factors would
prolong this period. For patients with chronic pain, however, NPHTs should be used over the
whole time horizon when there is a history of such factors, and only towards the end of the
time horizon when there is none.
The rest of this paper is organized as follows. In §2, we provide a review of relevant literature.
In §3, we discuss our data and study design. In §4, we present our analytical setting, including the
longitudinal machine learning models we have trained to predict the risks of OUD and UTP for
each patient, as well as a mathematical model we have developed to obtain cost-effective multi-modal
pain treatment plans. In §5, we present our numerical experiments and discuss relevant
insights and implications. In §6, we provide a summary of our policy recommendations. Finally, in
§7, we conclude the paper and briefly discuss avenues for future research.
2. Literature Review
2.1. Studies on Side Effects of Opioid Medications
A stream of literature related to our work focuses on establishing statistical associations between
opioid-related side effects and various risk factors such as age, gender, duration of supply, high dose
of opioid, and history of alcohol abuse, smoking, and mental health disorders (see, e.g., Cochran et
al. (2014) and Ciesielski et al. (2016)). For reviews of studies on the impact of opioid painkillers on
addiction, drug dependence, overdose, or death, one can refer to Fishbain et al. (2007) and Nuckols
et al. (2014).
Within this stream, some studies apply machine learning to address the opioid-related side effects.
Haller et al. (2016) employed Natural Language Processing techniques to predict risks of drug abuse
and addiction before a prescription is written. Che et al. (2017) made use of a Deep Feed-Forward
Neural Network to predict the possibility of long-term opioid use. Crosier et al. (2017) used random
forests to predict opioid overdose. Vunikili et al. (2018) utilized an Extreme Gradient Boosting
algorithm along with logistic regression to predict the risk of opioid abuse, overdose, and death.
Bjarnadottir et al. (2019) applied LASSO, adaptive boosting, and random forests to establish risk
factors associated with the chronic use of opioid painkillers. Compared to this stream, we not only
analyze the risk of opioid dependence, abuse, overdose, and death, but also measure the risk of
undertreated pain to account for potential benefits of pain treatments. Furthermore, the foregoing
studies are based on cross-sectional analyses (a patient’s record is gathered at a single point in
time without considering the dynamics of patient behavior or health information), whereas we take
a longitudinal approach. This allows us to study not only for what patients, but also when, the
benefits of opioid medications outweigh their risks.
2.2. Studies on Measuring Pain and Efficacy of Pain Management
To measure the intensity of pain and evaluate the efficacy of managing it, there are evidence-
based pain assessment scales, such as verbal rating scales, numerical rating scales, visual analogue
scales, and behavioral pain scales (see, e.g., Huskisson (1974), Katz and Melzack (1999)). These
scales are typically evaluated using patient-based surveys, which are often impacted by patients’
satisfaction (see, e.g., Wells et al. (2008)). However, accounting for patients’ satisfaction is known
to inadvertently propel providers to prescribe opioid medications. Thus, the Centers for Medicare
and Medicaid Services has recently proceeded towards removing pain management questions from
the Hospital Consumer Assessment of Healthcare Providers and Systems survey (see, e.g., Boloori
et al. (2020b)). To characterize the efficacy of pain management in our study, we instead focus on
repeated visits due to an unresolved pain-related condition, which indicate whether the condition
is being treated effectively (see, e.g., McPhillips-Tangum et al. (1998) and Xiao and Barber (2008)).
In particular, we make use of information from our claims data about repeated visits of patients
before and after each opioid prescription (for more details, see §3.3.2).
2.3. Relevant Operations Research/Management Science Studies
Pitt et al. (2018) proposed a dynamic compartmental model based on a combination of pain,
opioid use, and addiction status, and found that various resource allocation schemes (e.g., supplying
naloxone as an opioid antagonist, supplying needles for addicted patients, and promoting
medication-assisted treatments) could have a positive long-term impact on patients’ quality of
life. Freeman et al. (2019) conducted an empirical study and showed that getting a second opinion
such as visiting another provider rather than a primary care provider for opioid prescription is
associated with a lower rate of long-term opioid use. Gan et al. (2019) analyzed the impact of wear-
able devices on detecting opioid use disorder via urine tests, and proposed a partially observable
Markov decision process model to optimize decisions on wearing these devices given various budget
and patient adherence considerations. A broader set of issues in optimizing decisions regarding
care delivery via wearables and mobile health (mHealth) devices is discussed in studies such as
Saghafian and Murphy (2021), and the references therein.
From a methodology standpoint, the Operations Research/Management Science (OR/MS) lit-
erature has applied supervised learning in optimization frameworks. For example, Bertsimas et al.
(2016) followed this approach to improve cancer chemotherapy regimens. However, this body of lit-
erature has primarily focused on single-period decision-making problems, where the machine learn-
ing is applied to cross-sectional data. Reinforcement Learning (RL) is another branch of machine
learning suited for modelling multi-period decision-making problems. For example, Saghafian
(2022) shows how RL can be applied to longitudinal observational data sets in order to obtain
optimal dynamic treatment regimes that yield causal improvements, even if the longitudinal
observational data is subject to hidden confounders. Other approaches to handling multi-period settings
include Markov decision processes and multi-armed bandit models, which have been applied to
a variety of problems such as optimizing warfarin dosing (Bastani and Bayati 2020) and joint
immunosuppressive and diabetes medications (Boloori et al. 2020a). Despite their widespread
applications, these methods could fall short in addressing the curse of dimensionality, nonstationary
rewards, and history-dependency that may arise in problems like the one we study in this paper. In
addition, the curse of ambiguity introduced in Saghafian (2018, 2022) often limits the applicability
of these methods when applied to observational longitudinal data sets.
We contribute to this stream by developing a multi-period optimization model based on recurrent
neural networks (RNN) that allow us to analyze our longitudinal data. The RNN framework
also enables us to address some of the foregoing challenges such as dependency on history and
nonstationarity of rewards. To the best of our knowledge, our study is among the first in OR/MS
applying longitudinal deep learning to optimization models. From an application standpoint, to the
best of our knowledge, our proposed framework is the first to simultaneously address side effects
and potential benefits of multi-modal pain treatments. As we show, this is important, as it can
yield superior insights into how (a) physicians as human experts should prescribe these treatments
in practice, and (b) policymakers should adjust the guidelines.
Finally, a stream of literature suggests that “algorithm aversion” might limit the impact that
using an algorithmic approach such as ours might have in practice, because human experts might
be reluctant to make use of recommendations obtained from algorithms (Dietvorst et al. 2015).
Some recent studies, however, show that humans do possess “algorithm appreciation” (Logg et
al. 2019), and hence, in critical and complex decision-making settings such as the one we study
in this work, they are likely to take into account the advice they receive from a well-designed
algorithm. We believe improving the interpretability and removing the black-box nature of some
of the algorithms can go a long way in this regard. Thus, we adopt solution methodologies that
allow reducing the underlying complexities, and enable obtaining interpretable recommendations.
3. Data and Study Design
The data that we utilize in our study comes from the Merative MarketScan Commercial Dissertation
Databases for commercial claims and encounters (“CCAE” for brevity) for years 2008-2010.
There are two main sources of information that we have retrieved from the CCAE databases:
(a) information about patients’ encounters and diagnoses,² and (b) information about the history
of prescriptions.
3.1. Inclusion Criteria
We applied the following inclusion criteria, after which 3,013,637 patients were left in our data set:
(1) Full insurance enrollment in each year during 2008-2010. The CCAE data contains informa-
tion about patients (and their dependents) with private insurance coverage, and changes in such
coverage can result in a discontinuation of a patient’s medical records.
(2) No history of cancer or end-of-life (palliative) status. The CDC guidelines on opioid prescription
do not apply to patients who suffer from cancer or are in their palliative stage (Dowell et al. 2016).
When evaluating the CCAE data, we use the following keywords to identify such cases from the ICD-9
diagnosis codes: cancer, neoplasm, malignant, malignancy, benign, carcinoma, and palliative.
(3) No history of congenital anomalies or conditions originated in the perinatal period.
(4) At least one episode of an opioid analgesic prescription.
(5) No opioid prescription within the first 90 days of 2008 (the beginning of our data). Patients
whose records satisfy this condition are called opioid naïve. For patients who do not satisfy it, the
prescription might be related to a medical condition occurring prior to the beginning of our data. To be
consistent with the medical literature (see, e.g., Johnson et al. (2016)), we select a conservative range
of 90 days in our analysis.
² To determine the diagnoses, we follow the International Classification of Diseases, Ninth Revision, Clinical
Modification (referred to as ICD-9 hereafter).
(6) No history of opioid side effects within the first 90 days of 2008. If an overdose occurs for a
patient, we must know which medications s/he had used prior to the overdose that might have
had a significant impact on this incident. Therefore, consistent with criterion (5), we make use of
a 90-day window to exclude patients without sufficient information within our data set.
(7) No opioid prescription occurring prior to the first recorded encounter. We exclude patients that
do not satisfy this criterion, because our aim is to identify a pain-inducing medical condition due
to which an opioid has been prescribed for the first time.
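As a rough illustration of how criteria (2), (4), and (5) could be applied to one patient record, consider the minimal sketch below. It is not the authors' implementation: the record layout (`diagnoses`, `opioid_fill_dates`), the function names, and the exact boundary convention for the 90-day window are all assumptions made for the example; only the keyword list and the 90-day window come from the text.

```python
from datetime import date

# Keywords listed under criterion (2); matched here against diagnosis descriptions.
EXCLUSION_KEYWORDS = ("cancer", "neoplasm", "malignant", "malignancy",
                      "benign", "carcinoma", "palliative")
STUDY_START = date(2008, 1, 1)   # beginning of the data
WASHOUT_DAYS = 90                # window used in criteria (5) and (6)


def has_excluded_diagnosis(descriptions):
    """Criterion (2): True if any description contains a listed keyword."""
    return any(k in d.lower() for d in descriptions for k in EXCLUSION_KEYWORDS)


def is_opioid_naive(opioid_fill_dates):
    """Criterion (5): no opioid fill within the first 90 days of the data.

    Assumption: a fill exactly 90 days after the start counts as outside the window.
    """
    return all((d - STUDY_START).days >= WASHOUT_DAYS for d in opioid_fill_dates)


def passes_screen(patient):
    """Apply criteria (2), (4), and (5) to one hypothetical patient record."""
    fills = patient["opioid_fill_dates"]
    return (
        len(fills) >= 1                                        # criterion (4)
        and not has_excluded_diagnosis(patient["diagnoses"])   # criterion (2)
        and is_opioid_naive(fills)                             # criterion (5)
    )
```

The remaining criteria (enrollment continuity, perinatal conditions, side effects in the first 90 days, and the ordering of the first encounter and first prescription) would be added as further conjuncts in the same way.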
3.2. Independent Variables
A summary of our independent variables is presented in Table 2. Below, we provide further
information on how we have measured these variables.
3.2.1. Treatments: Pharmacologic. We measure the strength and duration of supply for
both opioid and non-opioid medications. We do this in three consecutive phases described below.
Phase 1: Information of original prescribed drugs. We first identify opioid and non-opioid
drugs that have been prescribed to patients. We note that some drugs can have both opioid and
non-opioid components. For example, ‘Acetaminophen/Propoxyphene’ with strength 325mg-50mg
is a drug, where ‘Acetaminophen’ (‘Propoxyphene’) is a non-opioid (opioid) component with the
strength of 325 (50) milligram per tablet. For such drugs, we decompose the generic name and
differentiate the strength of opioid from that of non-opioid. Out of 50 opioid and 51 non-opioid
unique drugs that are prescribed to patients in our data set, we determine 21 opioid and 34 non-opioid
unique components. A list of these components is provided in Appendix B.1.
To obtain the strength and duration of supply for the identified medications, we employ the
following information from the CCAE data: the strength per drug unit (STRENGTH), the number
Table 2 Summary of independent variables
Variable                                                                Average (S.D.)       Type††
Demographics: gender (female)                                           57.92% (0.03%)       Static
Demographics: age (years) before the first opioid prescription          40.02 (15.76)        Dynamic
Treatments–Pharmacologic: strength, opioid (MME)                        37.09 (28.26)        Dynamic
Treatments–Pharmacologic: strength, non-opioid (mg)                     352.21 (230.70)      Dynamic
Treatments–Pharmacologic: duration of supply (days)                     14.70 (10.40)        Dynamic
Treatments–Pharmacologic: use of non-opioid medication pre-supply       14.81% (0.02%)       Static
Treatments–Nonpharmacologic: use                                        45.05% (0.03%)       Dynamic
Pathology of pain: no pain                                              1.71% (0.01%)        Static
Pathology of pain: chronic secondary                                    14.13% (0.02%)       Static
Pathology of pain: chronic primary                                      27.99% (0.03%)       Static
Pathology of pain: acute                                                56.18% (0.03%)       Static
Behavioral Risk Factors: history of alcohol consumption                 1.31% (0.01%)        Dynamic
Behavioral Risk Factors: history of smoking                             3.98% (0.01%)        Dynamic
Behavioral Risk Factors: history of substance abuse                     0.49% (0.00%)        Dynamic
Behavioral Risk Factors: history of non-substance abuse                 1.11% (0.01%)        Dynamic
Behavioral Risk Factors: history of mental health disorder              55.45% (0.03%)       Dynamic
History of surgeries                                                    77.16% (0.02%)       Dynamic
History of inpatient admissions                                         22.87% (0.02%)       Dynamic
Number of visits in each window (see Remark 1)                          1.89 (2.59)          Dynamic
History: the variable’s value will not change once it occurs. Rates are reported out of 3,013,637 patients
considered in our study. †† Dynamic (static): the variable can (does not) change over time.
of units dispensed per prescription (QTY), the number of refills (REFILL), and days of supply for
each refill (DAYSUPP). We then set:
\[ \text{Duration of supply} = (\text{REFILL} + 1) \times \text{DAYSUPP}, \tag{1a} \]
\[ \text{Original strength (per day of supply)} = \frac{\text{STRENGTH} \times \text{QTY}}{\text{Duration of supply}}, \tag{1b} \]
where Equation (1b) is used twice, once for measuring the strength of opioids and once for that
of non-opioids. Furthermore, to test the potential collinearity between the resulting strengths and
duration of supply, we conduct Pearson’s product-moment correlation test with the null hypothesis
(H0) that there is no correlation. Based on our results (p-value > 0.05), we could not reject
this null hypothesis at the 95% confidence level.
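To make Equations (1a) and (1b) concrete, the following minimal sketch computes them for one prescription. It reads Equation (1a) as (REFILL + 1) times DAYSUPP and Equation (1b) as STRENGTH times QTY divided by the duration of supply; the function names and the example quantities (20 tablets, no refills, 10 days of supply) are hypothetical and chosen only for illustration.

```python
def duration_of_supply(refill: int, daysupp: float) -> float:
    """Equation (1a): (REFILL + 1) * DAYSUPP."""
    return (refill + 1) * daysupp


def strength_per_day(strength: float, qty: float, duration: float) -> float:
    """Equation (1b): STRENGTH * QTY / duration of supply."""
    if duration <= 0:
        raise ValueError("duration of supply must be positive")
    return strength * qty / duration


# Hypothetical fill of the combination drug named in the text
# (Acetaminophen/Propoxyphene, 325mg-50mg per tablet).
duration = duration_of_supply(refill=0, daysupp=10)
opioid_mg_per_day = strength_per_day(strength=50, qty=20, duration=duration)
non_opioid_mg_per_day = strength_per_day(strength=325, qty=20, duration=duration)
```

Equation (1b) is evaluated twice here, once per component, mirroring the decomposition of combination drugs into opioid and non-opioid parts.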
Phase 2: Transformation of drugs’ strength. We then convert the original strength of opioid
medications obtained from Equation (1b) to a common measure known as the Morphine Milligram
Equivalent (MME). For example, 1 milligram (mg) of Oxycodone is equal to 1.5 MME, but 1
milligram of Hydromorphone is equal to 4 MME (see, e.g., Palliative Drugs (2009) and CDC
(2020)). For non-opioid medications, we convert their strengths to a common unit: milligram (mg).
In Table 3, we summarize the information related to pain medications obtained after phases 1 and
2 described above.
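The conversion in Phase 2 amounts to multiplying a daily dose in milligrams by a drug-specific factor. The sketch below uses only the two factors quoted above; the lookup table and function name are illustrative assumptions, and factors for other opioids are deliberately left out rather than guessed.

```python
# MME per mg, restricted to the two examples quoted in the text.
MME_PER_MG = {"oxycodone": 1.5, "hydromorphone": 4.0}


def to_mme(component: str, mg_per_day: float) -> float:
    """Convert a daily opioid dose in mg to morphine milligram equivalents."""
    key = component.lower()
    if key not in MME_PER_MG:
        raise KeyError(f"no conversion factor listed for {component!r}")
    return MME_PER_MG[key] * mg_per_day
```

For instance, under these factors 20 mg of Oxycodone per day corresponds to 30 MME per day, and 10 mg of Hydromorphone per day corresponds to 40 MME per day.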
Phase 3: Adjustment based on number of prescriptions. To set up time intervals in which
patients visit providers to assess their condition, we consider a time window of W days starting
from the beginning of the first opioid supply. To be consistent with the recommended regular
intervals of opioid therapy reassessment (CDC 2016), we set W = 30 in our study. We then adjust
the strengths and duration of supply based on the number of times medications are prescribed in
each 30-day time window. Figure 1 illustrates an example of this adjustment. It should be noted
that different durations of supply are additive. However, different strengths are not additive. For
example, in Figure 1, we have a total of 10 + 5 = 15 days of supply in the time window, but we
cannot measure the total strength as 30 + 50 = 80 MME. Instead, as Figure 1 shows, the adjusted
strength is the average of the individual strengths weighted by their durations of supply. In
addition to the foregoing variables, and to thoroughly explore the efficacy of pain medications, we
consider the history of using non-opioid medications prior to the first opioid supply.
Table 3 Summary of information related to pain medications after phases 1 and 2
Description                                               Value
Number of pain medication prescriptions                   11,302,380
Number of refills                                         Average (standard deviation) = 0.22 (0.91)
Duration of supply (days)                                 Average (standard deviation) = 18.05 (26.86)
Strength, opioid (MME)                                    Average (standard deviation) = 38.13 (42.40)
Strength, non-opioid (mg)                                 Average (standard deviation) = 1,041.51 (489.92)
Method of drug consumption (out of all prescriptions)     Oral (99.42%), Transdermal (0.58%)
[Figure 1: timeline of one time window (W days), from day t to day t + W, containing two supplies.
First supply: strength (opioid) = 30 MME, strength (non-opioid) = 100 mg, duration = 10 days.
Second supply: strength (opioid) = 50 MME, strength (non-opioid) = 60 mg, duration = 5 days.
Total duration of supply = 10 + 5 = 15 days.
Adjusted strength (opioid) = (10 × 30 + 5 × 50)/(10 + 5) = 36.67 MME.
Adjusted strength (non-opioid) = (10 × 100 + 5 × 60)/(10 + 5) = 86.67 mg.]
Figure 1 Adjustment of strengths and duration of supply based on the number of prescriptions
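The adjustment in Figure 1 can be sketched as follows: durations are summed, while strengths are averaged with the durations as weights. The function name `adjust_window` and the handling of an empty window (returning zeros) are assumptions made for this illustration only.

```python
def adjust_window(supplies):
    """supplies: list of (duration_days, opioid_mme, non_opioid_mg) within one time window."""
    total_days = sum(d for d, _, _ in supplies)                    # durations are additive
    if total_days == 0:                                            # assumption: empty window
        return 0, 0.0, 0.0
    opioid = sum(d * o for d, o, _ in supplies) / total_days       # duration-weighted mean
    non_opioid = sum(d * n for d, _, n in supplies) / total_days   # duration-weighted mean
    return total_days, opioid, non_opioid

# The two supplies shown in Figure 1.
days, opioid, non_opioid = adjust_window([(10, 30, 100), (5, 50, 60)])
print(days, round(opioid, 2), round(non_opioid, 2))  # 15 36.67 86.67
```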
3.2.2. Treatments: Non-Pharmacologic. A multi-modal pain treatment involves both
pharmacologic treatments (e.g., opioid and non-opioid medications) and non-pharmacologic treat-
ments (e.g., physical therapy, chiropractic, and acupuncture). We identify the use of non-
pharmacologic treatments from our data set via codes related to procedure groups, provider types,
and service categories (see Table 4).
3.2.3. Pathology of Pain. We identify the clinical condition(s) based on which an opioid
medication is prescribed in the first place. However, in claims data, information on drug
prescriptions is not typically linked to visits and the corresponding diagnoses. As a result, a
pain-inducing condition is not necessarily identifiable. To address this, we focus on the
pre-supply period, defined as the time period prior to the first opioid prescription (i.e., prior
to the first time window in our study period).³ Specifically, we identify acute or chronic pain
conditions by exploring ICD-9 diagnosis codes within the pre-supply period. For acute pain, we
include acute conditions, injuries, or accidents that are explicitly labeled in ICD-9 codes, or
other symptoms that are either indicative of critical conditions, or would typically require
immediate medical care or surgical procedures. For brevity, we hereafter refer to these conditions
as “acute,” and term the other conditions pertaining to pain as “chronic.”
We observe from our data that there are typically two pathological paths for chronic pain that
could warrant a pain treatment. In the first path, there is no secondary medical condition involved,
and the chronic pain is primarily managed by pain medications (e.g., migraine). In the second
path, pain is induced by a secondary condition (e.g., hypertensive heart disease or pneumonia), and
addressing this secondary condition (e.g., by prescribing specialty drugs or performing designated
treatments) typically precedes prescribing pain medications. We refer to these two paths as primary
and secondary chronic pain conditions, respectively. In Appendix B.2, we present the ICD-9 diagnosis
codes that we have used to identify these conditions.
³ Pain medications are typically supplied within a short period after observing a pain-inducing
condition (e.g., surgery). Nevertheless, we consider a pre-supply period of 90 days to be conservative.
Table 4 Codes used to identify non-pharmacologic treatments
Procedure groups: 181–189 (physical medicine), 191 and 195 (chiropractic/spinal manipulation)
Provider types: 120 (chiropractor), 140 (pain management), 350 (physical medicine and rehab),
865 (acupuncturist), 870 (spiritual healers)
Service categories: 3xy18 for x ∈ {0, 1} and y ∈ {1, 2, 3, 4, 5, 6} (behavioral health therapy)
3.2.4. Behavioral Risk Factors. We consider five types of behavioral factors, including
alcohol consumption, smoking, substance drug disorders, non-substance drug disorders, and mental
health disorders. In Appendix B.3, we provide more details on how we have identified these factors
using our data.
3.2.5. Surgeries and Inpatient Admissions. Finally, we include information on
whether a patient has had a surgical procedure and/or received an inpatient admission. We do so
because the treatment of post-surgical pain with opioid medications has been reported as a cause for
an onset of post-operative opioid addiction or long-term opioid usage after hospital discharge (see,
e.g., Pletcher et al. (2008) and Wu et al. (2019)). In Appendix B.4, we provide further information
on how we have identified surgery and inpatient admissions using our data.
3.3. Dependent Variables
Our dependent variables capture the occurrence of two events: opioid use disorder (OUD) and
undertreated pain (UTP). We term these as Events 1 and 2, respectively, and define them below.
A suitable treatment is one that avoids both of these events.
3.3.1. Event 1 (OUD). We define Event 1 so that it captures whether the patient experiences
any side effect that can be attributed to OUD.
Definition 1 (Event 1). We say Event 1 has occurred within a time window if, in that win-
dow, there is at least one incidence of dependence, abuse, poisoning, or any adverse effect caused
by an opioid.
In Appendix C.1, we present the ICD-9 codes that we have used to identify the occurrence of
Event 1. We note that if Event 1 occurs, it can be very detrimental to the patient (e.g., cause death)
even if it happens only once. Furthermore, among 7,518 patients who experienced Event 1 in our
data, 7,408 (98.54%) experienced it only once within a time window (W = 30 days). Therefore, in
our analysis, we differentiate between no occurrence and at least one occurrence of Event 1 during
each time window.
3.3.2. Event 2 (UTP). Pain relief is the first outcome measure in assessing the efficacy of
pain medications (see, e.g., Teater (2015) and Smith et al. (2018)). When a patient is prescribed
a medication to address a pre-existing pain-inducing condition, but has post-supply visits due
to the same condition, it is likely that the visits are due to undertreated pain (see, e.g.,
McPhillips-Tangum et al. (1998) and Xiao and Barber (2008)). As mentioned earlier, however,
information on medication prescriptions is not typically linked to visits and the corresponding
diagnoses in claims data. As a result, identifying pain-inducing conditions that have triggered an
opioid medication requires taking some extra steps. To this end, for each patient, we identify some
Baseline Medical Conditions (BMCs, described below) and make use of them to define Event 2 (see
Definition 2).
Baseline Medical Conditions (BMCs). In the absence of a surgical procedure, the purpose of
prescribing painkillers is typically to address existing acute or chronic pain conditions (for
discussions related to the pathology of pain, see §3.2.3). Thus, for patients without surgical
procedures, we characterize the BMCs using acute or chronic pain conditions pre-supply. In contrast, having a
surgical procedure pre-supply would impact the characterization of the BMCs in two ways. First, a
provider may prescribe pain medications because s/he is concerned about the underlying acute or
chronic pain-inducing condition that prompted the surgery in the first place. Second, the provider
could prescribe pain medications because of concerns related to pain conditions or complications
arising after the surgical procedure in the pre-supply period. Therefore, for patients with surgeries,
we characterize the BMCs based on a combination of (a) acute/chronic pre-surgical conditions,
and (b) post-surgical pain-related conditions (PSPs) (see Appendix C.2 for the characterization of
PSPs). Figure 2 illustrates the full steps taken to characterize the BMCs. Of note, in characterizing
the BMCs, we assume that an opioid prescription is attributed to an acute condition first and then
to a chronic condition. This is also consistent with the CDC’s recommendations (see, e.g., CDC
(2016)).
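Definition 2 (Event 2). Let $DX_n$ be a diagnosis among visits in time window $n \ge 1$. Then, we
say that Event 2 has occurred in time window $n$ if $DX_n$ is among the baseline medical conditions:
\[
\text{Event 2 (in window } n\text{)} =
\begin{cases}
1, & \text{if } DX_n \in \text{BMCs}, \\
0, & \text{if } DX_n \notin \text{BMCs}.
\end{cases}
\tag{2}
\]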
A value of 1 for Event 2 in a time window indicates that a baseline medical condition still exists
in that window. Intuitively, compared to Event 2 = 1, Event 2 = 0 can indicate some potential
benefits in using pain treatments.
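Definition 2 can be illustrated with a small sketch. It assumes that a window is flagged as soon as any diagnosis recorded in that window belongs to the patient's BMCs; the function name `event2` and the diagnosis labels are placeholders, not actual ICD-9 codes.

```python
def event2(window_diagnoses, bmcs):
    """Return 1 if any diagnosis recorded in the window is a baseline medical condition, else 0."""
    return int(any(dx in bmcs for dx in window_diagnoses))

bmcs = {"DX_A", "DX_B"}                       # placeholder labels, not real ICD-9 codes
windows = [["DX_A", "DX_C"], ["DX_C"], []]    # diagnoses observed in windows n = 1, 2, 3
print([event2(w, bmcs) for w in windows])     # [1, 0, 0]
```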
Remark 1 (Impact of Repeated Visits). We establish Definition 2 and its impacts on the
potential benefits of pain treatments based on the premise that repeated visits for a similar medical
condition could indicate undertreated pain. However, we must also account for cases where a patient
visits providers too often. Visiting a provider too often has two implications: it could inflate the
occurrence of Event 2, and it might reveal drug-seeking behavior by that patient, which is
known to be a contributing factor to opioid use disorders (see, e.g., Grover et al. (2012)). Therefore,
in addition to the independent variables described in §3.2, we adjust for another variable in our
analysis: the total number of visits in each time window.⁴
4. The Analytical Setting
We use longitudinal machine learning algorithms to predict the risks of Events 1 and 2 with high
accuracy for each individual. We then feed the best machine learning algorithm to an optimization
model that allows us to determine cost-effective pain treatments based on each patient’s
characteristics. We note that the black-box nature of the machine learning methods, along with the
curse of dimensionality in the optimization model (due to multiple treatment modes), could negatively
impact the implementation and interpretability of our approach for use in practice. Therefore, we
propose two heuristics that are easy to adopt in a decision-support system. Later, in §5, we
compare the performance of these algorithmic-based heuristics with the human-based practices
and show that our proposed approaches are more cost-effective.
⁴ Based on our data, the average (s.d.) of the number of visits in each window (W = 30 days) is 1.89 (2.59).
[Figure 2: flowchart of yes/no questions about diagnoses in the pre-supply period.
Q1: is there any surgical procedure?
Q2: is there any diagnosis from DXs_acute?
Q3: is there any diagnosis from DXs_primary?
Q4: is there any diagnosis from DXs_secondary?
Q5: before surgery, is there any diagnosis from DXs_acute?
Q6: before surgery, is there any diagnosis from DXs_primary?
Q7: before surgery, is there any diagnosis from DXs_secondary?
Branch outcomes: BMCs = DXs_acute; BMCs = DXs_primary; BMCs = DXs_secondary;
BMCs = PSPs ∪ DXs_acute before surgery; BMCs = PSPs ∪ DXs_primary before surgery;
BMCs = PSPs ∪ DXs_secondary before surgery; BMCs = PSPs; Remove patient.
Branch-end patient counts as printed in the figure: 840,702; 566,172; 324,305; 33,305; 852,237;
277,352; 101,483; 18,082.]
Figure 2 Characterization of the Baseline Medical Conditions (BMCs)
Notes. DXs_acute, DXs_primary, DXs_secondary, and PSPs: sets of diagnosis codes for acute pain, chronic primary
pain, chronic secondary pain, and post-surgical pain-related conditions, respectively. The value at the end of each
branch represents the number of unique patients under that category based on the CCAE data.
4.1. Longitudinal Machine Learning Algorithms
Most Machine Learning (ML) algorithms are used for cross-sectional studies, where subjects’ (e.g.,
patients’) information is recorded only at one point in time. However, our data is longitudinal
in that patients are monitored and treated over time, and thus, their information, variables, and
measurements dynamically change. Thus, we make use of ML methods that are suitable for lon-
gitudinal settings. Generalized Estimating Equations Logistic Regression (GEE Logit) is the first
method that we use. GEE Logit takes advantage of the simplistic nature of logistic regression,
while accounting for the longitudinal aspect of the data (see, e.g., Wilson and Lorenz (2015)). The
second method we use is based on tree ensembles for longitudinal data (see, e.g., Capitaine et al.
(2019) and Mišić (2020)). In particular, we adopt Historical Random Forest (HRF) (Sexton 2018).
The third and fourth methods we use are based on Artificial Neural Networks. In particular, we
make use of a Recurrent Neural Network (RNN) (Rumelhart et al. 1986) as well as a Long
Short-Term Memory (LSTM) network (Hochreiter and Schmidhuber 1997). Unlike GEE Logit, the other
methods require tuning hyperparameters. In Table 5, we describe these parameters. For HRF, we have a
tuple (t, n, v), which forms a three-dimensional grid of candidate values with 5 × 4 × 7 = 140
different combinations. For RNN and LSTM, we have a tuple (h, r, γ, w); because γ is determined by r
(see below), this forms a three-dimensional grid of candidate values with 5^3 = 125 different
combinations. As noted by Hastie et al. (2009), in neural-network-based methods, when the number of
epochs (denoted by r) increases to infinity, the learning rate (denoted by γ) decreases to zero. To
reflect this, we set $\gamma = 10^{-(r-10)/10}$ (e.g., for r = 20, γ = 0.1), and do not consider
variations for γ separately.
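The two hyperparameter grids can be enumerated with a short sketch. It assumes the candidate values listed in Table 5, reads the weight-decay candidates as 10^{-1}, ..., 10^{-5}, and ties the learning rate to the number of epochs through γ = 10^{-(r-10)/10}. The names `HRF_GRID`, `NN_GRID`, and `learning_rate` are illustrative only.

```python
from itertools import product

HRF_GRID = {
    "t": [100, 200, 500, 800, 1000],          # number of trees to grow
    "n": [2, 5, 8, 10],                       # minimum node size
    "v": [1, 2, 3, 4, 5, 8, 12],              # variable candidates sampled at each split
}
NN_GRID = {
    "h": [10, 20, 50, 100, 200],              # hidden dimension
    "r": [5, 10, 20, 50, 100],                # number of epochs
    "w": [10.0 ** -i for i in range(1, 6)],   # weight decay (assumed 1e-1, ..., 1e-5)
}

def learning_rate(r):
    """Learning rate implied by the number of epochs r (not tuned separately)."""
    return 10.0 ** (-(r - 10) / 10)

hrf_candidates = list(product(*HRF_GRID.values()))
nn_candidates = [(h, r, learning_rate(r), w) for h, r, w in product(*NN_GRID.values())]
print(len(hrf_candidates), len(nn_candidates))  # 140 125
```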
4.1.1. Comparison of ML Methods under Various Hyperparameters. After data preprocessing
(see Appendix D), we apply our longitudinal ML methods to our data and compare their
performance. We do so by measuring the area under the receiver operating characteristic curve
(AUC) under each method. The AUC values are, in turn, calculated using two validation mechanisms:
10-fold cross-validation (CV) and out-of-sample validation (OOS). Regarding CV, we train the
ML method on 90% of randomly selected patients, and test it on the remaining 10%. We then
repeat this procedure 10 times, and report the average AUC across these 10 folds. Regarding OOS,
we create our train/test data sets based on the year in which a patient’s information was initiated.
Specifically, we first create a training set from patients whose first encounters are recorded in
2008-2009, and then test the trained method on patients whose first encounters are in 2010.
Afterwards, we create another training set based on 2008, and test the trained method on patients
whose first encounters are recorded in 2009-2010. We then report the average AUC across these two
iterations. In Table 6, we show the best hyperparameters (i.e., those resulting in the highest AUC).
In Figure 3, we depict the AUC values obtained from our ML methods under their best hyperparameters.
Based on these results, we choose RNN with $(h, r, \gamma, w) = (50, 20, 10^{-1}, 10^{-3})$ as the ML
method with the best performance. In §4.2, we feed this trained RNN to an optimization model to
determine the best multi-modal pain treatments.
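The two validation mechanisms can be sketched at the level of patient identifiers. This is only an illustration: it assumes that the CV folds are disjoint random groups of patients, and the helper names `patient_folds` and `year_splits` are hypothetical. Model training and AUC computation are left out.

```python
import random

def patient_folds(patient_ids, k=10, seed=0):
    """Randomly partition patients into k disjoint folds; each fold serves once as the test set."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    return [ids[i::k] for i in range(k)]

def year_splits(first_year_by_patient):
    """Out-of-sample splits by year of first encounter: 2008-2009 vs. 2010, then 2008 vs. 2009-2010."""
    schemes = [({2008, 2009}, {2010}), ({2008}, {2009, 2010})]
    splits = []
    for train_years, test_years in schemes:
        train = sorted(p for p, y in first_year_by_patient.items() if y in train_years)
        test = sorted(p for p, y in first_year_by_patient.items() if y in test_years)
        splits.append((train, test))
    return splits

folds = patient_folds(range(100))
print(len(folds), len(folds[0]))                       # 10 10
print(year_splits({"a": 2008, "b": 2009, "c": 2010}))
```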
Table 5 Hyperparameters for our ML algorithms
Method      Parameter                                                    Candidate value(s)
HRF         t: number of trees to grow                                   (100, 200, 500, 800, 1,000)
            n: minimum node size                                         (2, 5, 8, 10)
            v: number of variable candidates to sample at each split    (1, 2, 3, 4, 5, 8, 12)§
RNN/LSTM    h: hidden dimension (size of the hidden layer)               (10, 20, 50, 100, 200)††
            r: number of epochs (iterations over the training data)     (5, 10, 20, 50, 100)††
            γ: learning rate of the algorithm                            γ = 10^{-(r-10)/10}
            w: learning rate (weight) decay                              (10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}, 10^{-5})
The same hyperparameters are used for these two methods. Candidate values are set based on values in
Hastie et al. (2009). § We use 1.5^i (x/3) for i = −4, ..., 2 and x: # independent variables (see, e.g.,
Bertsimas et al. (2016)). †† Candidate values are set based on those recommended by Choi et al. (2016).
Table 6 Best hyperparameters for each ML algorithm under different validation mechanisms
Event   Method      10-fold cross-validation                         Out-of-sample validation
OUD     GEE Logit   -                                                -
        HRF         (t, n, v) = (200, 5, 2)                          (t, n, v) = (500, 8, 2)
        RNN         (h, r, γ, w) = (50, 20, 10^{-1}, 10^{-3})        (h, r, γ, w) = (50, 10, 1, 10^{-3})
        LSTM        (h, r, γ, w) = (100, 20, 10^{-1}, 10^{-3})       (h, r, γ, w) = (20, 50, 10^{-4}, 10^{-3})
UTP     GEE Logit   -                                                -
        HRF         (t, n, v) = (500, 2, 2)                          (t, n, v) = (500, 2, 2)
        RNN         (h, r, γ, w) = (50, 20, 10^{-1}, 10^{-3})        (h, r, γ, w) = (100, 20, 10^{-1}, 10^{-5})
        LSTM        (h, r, γ, w) = (100, 10, 1, 10^{-3})             (h, r, γ, w) = (50, 10, 1, 10^{-5})
ROC curves under the best hyperparameters are shown in Figure 3.
[Figure 3: four ROC panels plotting sensitivity (%) against specificity (%), with the following AUC values.
(a) Event 1, validation: 10-fold CV. Logit: 97.00%, HRF: 95.40%, RNN: 96.55%, LSTM: 90.21%.
(b) Event 1, validation: out-of-sample. Logit: 97.14%, HRF: 98.21%, RNN: 95.56%, LSTM: 93.92%.
(c) Event 2, validation: 10-fold CV. Logit: 73.55%, HRF: 82.00%, RNN: 86.84%, LSTM: 82.05%.
(d) Event 2, validation: out-of-sample. Logit: 72.45%, HRF: 81.49%, RNN: 91.44%, LSTM: 81.95%.]
Figure 3 (Color online) Receiver operating characteristic curves under the best hyperparameters
Notes. Logit: generalized estimating equations logistic regression, HRF: historical random forest, RNN: recurrent
neural network, LSTM: long short-term memory.
4.2. Optimization Model
We aim to determine multi-modal pain treatment plans that are most cost-effective. To this end,
for any treatment plan, we measure the net monetary benefit (see, e.g., Drummond et al. (2015)):
\[
\text{Net monetary benefit} = \text{WTP} \times \text{QALY} - \text{COST}, \tag{3}
\]
where QALY is the quality-adjusted life years that a patient can accrue, WTP is the willingness to
pay for an additional QALY, and COST is the total cost of care (e.g., reimbursement and co-payment).
Let $A = A_{os} \times A_{nos} \times A_{ds} \times A_{nph}$ be the set of all feasible actions
representing all possible multi-modal treatment options ($\times$: the Cartesian product).⁵
$A_{os} = [0, 100]$ MME and $A_{nos} = [0, 700]$ mg are feasible intervals for the opioid and
non-opioid strengths, respectively. $A_{ds} = \{0, 1, \ldots, 30\}$ days is the feasible set for the
duration of supply of medications, and $A_{nph} = \{0$: No non-pharmacologic treatment, $1$: Use
non-pharmacologic treatment$\}$ is the set of possible actions with respect to non-pharmacologic
treatments.⁶ Also, let $a = (a_1, \ldots, a_T) \in A^T$ be a sequence of multi-modal treatments (e.g.,
doses of opioid and non-opioid medications, their duration of supply, and use of non-pharmacologic
treatments) taken throughout the time horizon $T$, and
$H_t = (V_1, a_1, \ldots, V_{t-1}, a_{t-1}, V_t)$ be the history of the patient’s covariates up to
time window $t$ (denoted by $V_1, V_2, \ldots, V_t$) as well as all the prior treatments. For Event
$k \in \{1, 2\}$, we measure the cost and QALY accrued over the time horizon by:
\[
\text{COST}_k(a) = \sum_{t=1}^{T} \beta^{t-1} P_k(a_t \mid H_t)\, C_k, \tag{4a}
\]
\[
\text{QALY}_k(a) = \sum_{t=1}^{T} \beta^{t-1} r_k(a_t \mid H_t), \tag{4b}
\]
where $\beta \in [0, 1)$ is a discount factor that allows us to prioritize the outcomes in the
current period over those in the future, $P_k(a_t \mid H_t)$ is the probability of experiencing Event
$k$ (obtained by our trained RNN) when the patient’s history is $H_t$ and treatment is $a_t$, $C_k$ is
the total cost accrued due to an occurrence of Event $k$ (including all payments to providers), and
$r_k(a_t \mid H_t)$ is the immediate QALY gained under $a_t$ and $H_t$. We denote by $q_{k0}$
($q_{k1}$) the monthly quality-of-life (qol) score for a patient who does not (does) experience Event
$k$, where $0 \le q_{k1} \le q_{k0}$. Using this notation, the immediate QALY is measured as:
\[
\begin{aligned}
r_k(a_t \mid H_t) &= \big(1 - P_k(a_t \mid H_t)\big) \times q_{k0} + P_k(a_t \mid H_t) \times q_{k1} \\
&= q_{k0} - (q_{k0} - q_{k1}) \times P_k(a_t \mid H_t) \quad \text{for } t \le T,
\end{aligned}
\tag{5}
\]
where the first (second) part of the summation on the first line indicates the expected QALY for a
patient who is not experiencing (experiencing) Event $k$.
Using Equations (3)-(5), we measure the net benefit for Event $k \in \{1, 2\}$ as:
\[
NB_k(a) = \text{WTP} \times \text{QALY}_k(a) - \text{COST}_k(a) \tag{6a}
\]
\[
\phantom{NB_k(a)} = \sum_{t=1}^{T} \beta^{t-1} \Big[ q_{k0} \times \text{WTP} - P_k(a_t) \big( (q_{k0} - q_{k1}) \times \text{WTP} + C_k \big) \Big], \tag{6b}
\]
where, for notational brevity, we suppress the dependency of $P_k(\cdot)$ and subsequent notations on
$H_t$.
⁵ Summary of all notations is provided in Appendix A.
⁶ 100 MME and 700 mg are the 95th percentiles of values in our data for opioid and non-opioid strengths,
respectively. Furthermore, the maximum duration of supply is equal to the length of a time window. Also,
among patients who ever used non-pharmacologic treatments, 83.08% used either physical therapy or
chiropractic. Therefore, to reflect on whether any non-pharmacologic option should be used, we consider a
binary variable. However, we do not differentiate treatments based on the specific type of
non-pharmacologic treatment, since the majority of non-pharmacologic treatment use relates to either
physical therapy or chiropractic.
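As a numerical check of Equations (4a)-(6b), the sketch below evaluates the net benefit for one event in two ways: as WTP × QALY − COST, and through the closed-form sum in (6b). The qol scores, cost, and WTP follow the baseline values for Event 1 in Table 8, whereas the discount factor β = 0.97 and the per-window risks are hypothetical values chosen only for illustration (in the paper, the risks come from the trained RNN).

```python
def net_benefit(risks, q0, q1, cost, wtp, beta):
    """Return the discounted net benefit for one event via (6a) and via (6b)."""
    # enumerate starts at 0, which matches the exponent t - 1 for t = 1, ..., T
    qaly = sum(beta ** t * (q0 - (q0 - q1) * p) for t, p in enumerate(risks))   # (4b) with (5)
    total_cost = sum(beta ** t * p * cost for t, p in enumerate(risks))         # (4a)
    nb_6a = wtp * qaly - total_cost
    nb_6b = sum(beta ** t * (q0 * wtp - p * ((q0 - q1) * wtp + cost))
                for t, p in enumerate(risks))
    return nb_6a, nb_6b

risks = [0.02, 0.015, 0.01]          # hypothetical P_1(a_t | H_t) for three windows
nb_6a, nb_6b = net_benefit(risks, q0=1 / 12, q1=0.368 / 12,
                           cost=15588.07, wtp=10000, beta=0.97)
print(round(nb_6a, 2), round(nb_6b, 2))  # the two forms agree up to rounding
```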
Using (6a)-(6b), we aim to find, for each patient, the treatment plan that maximizes the weighted
average net benefit:
\[
\max_{a \in A^T} \Big\{ NB(a) = \sum_{k=1}^{2} w_k \, NB_k(a) \Big\}
\quad \text{for } w_1, w_2 \in [0, 1] \text{ s.t. } w_1 + w_2 = 1, \tag{7}
\]
where $w_k$ is the weight assigned to Event $k$.
4.2.1. Heuristic Solution Methods. In finding optimal multi-modal pain treatments, we
note that the feasible values for the opioid and non-opioid strengths are continuous. Even a
grid-based uniform discretization (e.g., {0, 10, ..., 100} MME and {0, 50, ..., 700} mg for
opioid and non-opioid strengths, respectively) results in 11 × 15 × 31 × 2 = 10,230 different
treatment combinations to be explored within the feasible action space A for each time window.
Because the number of treatment sequences grows exponentially with the number of time windows, this
causes a curse of dimensionality. To address this curse, we consider two heuristic approaches: myopic
and rolling-horizon. In the myopic approach, we select the treatment resulting in the highest average
net benefit in each window. That is, we only value the current outcomes and do not consider the
impact of current decisions on future outcomes. In the rolling-horizon approach, we determine the
sequence of optimal treatments for a sub-horizon that is computationally tractable (in our
experiments, we consider a sub-horizon of 3 time windows), and move this sub-horizon one time window
forward as we go. Compared to the myopic policy, this accounts for the impact of current decisions
on both current and future outcomes (see Table 7 for a summary of these heuristic approaches).
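The size of the discretized action space, and a single myopic step over it, can be sketched as follows. The grid matches the example discretization above; the scoring function `toy_net_benefit` is a made-up stand-in for the RNN-based weighted net benefit and carries no clinical meaning.

```python
from itertools import product

OPIOID_MME = range(0, 101, 10)      # {0, 10, ..., 100} MME
NON_OPIOID_MG = range(0, 701, 50)   # {0, 50, ..., 700} mg
DURATION_DAYS = range(0, 31)        # {0, 1, ..., 30} days
NON_PHARM = (0, 1)                  # 0: no non-pharmacologic treatment, 1: use it

ACTIONS = list(product(OPIOID_MME, NON_OPIOID_MG, DURATION_DAYS, NON_PHARM))
print(len(ACTIONS))  # 10230

def myopic_choice(actions, net_benefit):
    """Pick the action with the highest net benefit in the current window only."""
    return max(actions, key=net_benefit)

def toy_net_benefit(action):
    """Hypothetical score peaking at (40 MME, 100 mg, 7 days, non-pharmacologic = 1)."""
    opioid, non_opioid, days, non_pharm = action
    return -abs(opioid - 40) - abs(non_opioid - 100) / 10 - abs(days - 7) + non_pharm

print(myopic_choice(ACTIONS, toy_net_benefit))  # (40, 100, 7, 1)
```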
5. Numerical Experiments
To perform our main numerical experiments, we make use of the parameter values shown in Table 8.
In §5.2, we perform extensive robustness checks on these parameters, and test the validity of our
main findings by varying them.
5.1. Comparison with the Human-Based Guidelines and Practice
5.1.1. Cost-Effectiveness of Algorithmic-Based Treatment Policies. In this section,
we compare the cost-effectiveness of our algorithmic-based treatment policies with that of the
CDC guidelines as well as the human-based approaches followed in practice by physician experts,
as evidenced from our data. To this end, we develop a simulation model where we first estimate
distributions of multi-modal pain treatments from our data. Then, we adjust these estimated
distributions by the thresholds recommended by the CDC 2016 guidelines (further details are
provided in Appendix E). For any WTP > 0, our treatment policy is said to be more cost-effective
than the guidelines if (see, e.g., Drummond et al. (2015)):
\[
\text{Incremental Cost-Effectiveness Ratio (ICER)}
= \frac{\text{Incremental Cost}}{\text{Incremental QALY}}
= \frac{\text{Cost(our policy)} - \text{Cost(guidelines)}}{\text{QALY(our policy)} - \text{QALY(guidelines)}}
\le \text{WTP}. \tag{8}
\]
Table 7 Summary of heuristic solution approaches
Inputs: trained RNN, action space $A$ (grid-based), time horizon $T$, sub-horizon $L < T$, discount factor $\beta$,
vectors of patient covariates $V_1, \ldots, V_T$, $H_1 = (V_1)$
Solution Approach: RNN-Myopic
1 for $t = 1, \ldots, T$
2   for each $a_t \in A$
3     measure the risk of experiencing Event 1 (OUD) and Event 2 (UTP) by the trained RNN
4     measure the weighted avg. net benefit $NB(a_t \mid H_t)$ by Equations (6a)-(7)
5   select treatment $a_t^* = \arg\max_{a_t \in A} \{NB(a_t \mid H_t)\}$
6   if $t \le T - 1$
7     update history $H_{t+1} = (V_1, a_1^*, \ldots, V_t, a_t^*, V_{t+1})$
Solution Approach: RNN-Rolling-Horizon
1 for $t = 1, \ldots, T$
2   for $l = t, \ldots, t + L - 1$
3     for each $a_l \in A$
4       measure the risk of experiencing Event 1 (OUD) and Event 2 (UTP) by the trained RNN
5       measure the weighted avg. net benefit $NB(a_l \mid H_l)$ by Equations (6a)-(7)
6   select the sequence of treatments $a_t^*, \ldots, a_{t+L-1}^* = \arg\max_{a_t, \ldots, a_{t+L-1} \in A} \{\sum_{l=t}^{t+L-1} \beta^{l-t} NB(a_l \mid H_l)\}$
7   if $t \le T - 1$
8     update history $H_{t+1} = (V_1, a_1^*, \ldots, V_t, a_t^*, V_{t+1})$
Output: treatments $a_1^*, \ldots, a_T^*$
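The rolling-horizon approach in Table 7 can be sketched on a toy problem. The sketch below makes several simplifying assumptions that are not part of the paper: the history consists of past actions only (no covariates), the sub-horizon is truncated when it would extend beyond T, only the first action of each planned sequence is committed, and `toy_nb` is a made-up net-benefit function in which an action pays off immediately but reduces the net benefit of all later windows. Setting L = 1 recovers the myopic approach.

```python
from itertools import product

def rolling_horizon(actions, nb, T, L, beta):
    """Pick one action per window by optimizing over a sub-horizon of up to L windows.

    nb(action, history) is a user-supplied net-benefit function; history is the
    tuple of actions committed or tentatively planned so far.
    """
    committed = []
    for t in range(T):
        span = min(L, T - t)                      # truncate the sub-horizon at the end of T
        best_value, best_plan = float("-inf"), None
        for plan in product(actions, repeat=span):
            history, value = tuple(committed), 0.0
            for step, action in enumerate(plan):
                value += beta ** step * nb(action, history)
                history += (action,)
            if value > best_value:
                best_value, best_plan = value, plan
        committed.append(best_plan[0])            # keep only the first action, then roll forward
    return committed

def toy_nb(action, history):
    """Hypothetical: +1 for acting now, -2 per earlier action in every later window."""
    return action - 2.0 * sum(history)

print(rolling_horizon([0, 1], toy_nb, T=4, L=3, beta=1.0))  # [0, 0, 0, 1]
print(rolling_horizon([0, 1], toy_nb, T=4, L=1, beta=1.0))  # [1, 1, 1, 1]
```

In this toy setting, the myopic policy (L = 1) takes the immediate payoff in every window, whereas the rolling-horizon policy defers it until the last window, where no future outcome can be affected.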
Table 8 Summary of baseline parameters
Parameter                               Description
T = 12                                  Time horizon (12 months)
WTP = 10,000 ($/QALY)                   Willingness to pay
q_10 = q_20 = 1/12 = 0.083 yr           Monthly qol score for not experiencing Events 1 or 2 (i.e., perfect health)
q_11 = 0.368/12 = 0.031 yr              Monthly qol score for experiencing Event 1 (see, e.g., De Maeyer et al. (2010))
q_21 = 0.395/12 = 0.033 yr (acute),     Monthly qol score for experiencing Event 2 (for an acute and chronic pain
       0.470/12 = 0.039 yr (chronic)    pre-supply) (see, e.g., Katz (2002), Wu et al. (2003), Tüzün (2007), Ataoğlu et al.
                                        (2013), Taylor et al. (2013), Hadi et al. (2019) and references therein)
C_1 = $15,588.07                        Costs associated with an occurrence of Event 1§
C_2 = $6,172.86 (acute),                Costs associated with an occurrence of Event 2 (for an acute and chronic pain)§
      $2,714.90 (chronic)
w_1 = w_2 = 0.5                         Weights assigned to Events 1-2
Yearly qol scores are converted to their monthly equivalents. We consider the same qol score for both genders (see, e.g.,
Giacomuzzi et al. (2005) and Domingo-Salvany et al. (2010)). § To estimate these values, we first obtain the average cost
from our data, and then use the average U.S. healthcare inflation rate (USIC 2021) to prorate the costs from years
2008-2010 (time frame of our data) to year 2021.
We iterate our simulation 10,000 times to account for variations in dynamic risk covariates and
treatments generated by the guidelines. The percentage of instances (out of 10,000) that satisfy
(8) measures the cost-effectiveness probability (see, e.g., Fenwick et al. (2006)).
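The cost-effectiveness probability can be illustrated with a short sketch. It checks condition (8) in its net-benefit form, WTP × ΔQALY − ΔCost ≥ 0, which coincides with ICER ≤ WTP when the incremental QALY is positive and avoids dividing by a zero or negative increment; this reformulation is a choice made for the sketch, not a description of the paper's simulation code. The three instances are made-up numbers.

```python
def cost_effective(cost_new, qaly_new, cost_ref, qaly_ref, wtp):
    """Net-benefit form of condition (8): WTP * dQALY - dCost >= 0."""
    return wtp * (qaly_new - qaly_ref) - (cost_new - cost_ref) >= 0

def cost_effectiveness_probability(instances, wtp):
    """instances: list of (cost_new, qaly_new, cost_ref, qaly_ref), one per simulation run."""
    hits = sum(cost_effective(*inst, wtp) for inst in instances)
    return hits / len(instances)

# Hypothetical simulation output: (policy cost, policy QALY, guideline cost, guideline QALY).
runs = [(900, 0.80, 1000, 0.78), (1200, 0.81, 1000, 0.80), (1500, 0.79, 1000, 0.80)]
print(cost_effectiveness_probability(runs, wtp=50_000))
print(cost_effectiveness_probability(runs, wtp=10_000))
```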
The CDC has recently (in 2022) put forth a new set of guidelines to rectify some of the issues
attributed to the 2016 guidelines. In particular, they proposed to remove the recommended
thresholds for the opioid dose and duration of supply (90 MME for opioid dose and a total of 7 (90)
days for the duration of supply for acute (chronic) pain). However, some believe that the new
guidelines have the potential for human misunderstanding or misapplication, especially since they
still emphasize the harms of opioids rather than the harms of poorly managed pain. In addition,
some other experts are wary of removing the aforementioned thresholds, and believe
that doing so could make it more difficult for physicians who have relied on these thresholds
in their practices over the past six years (for more details, see CDC (2022) and NPR (2022)). To
gain some insights, we also compare our algorithmic-based approach with the CDC guidelines after
incorporating the 2022 modifications (see Appendix E for more details).
For both the 2016 and 2022 guidelines, we also run our simulations under various opioid tapering
rates. We do so because in practice an initial opioid dose is typically tapered down over time at
an exogenously set linear rate (see, e.g., CDC (2016) and FDA (2019)). For example, if an initial
dose is 50 MME and the tapering rate is 20% per month, then the resulting dose in the second month
will be 50 × 0.8 = 40 MME. This is done with the purpose of balancing the risks of opioid-related
disorders and opioid-withdrawal symptoms. To reflect this, we simulate each set of guidelines
under different tapering rates denoted by θ ∈ {0.05, 0.20, 0.50} (VA.gov 2016).
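A tapering schedule can be sketched as follows. The text only specifies the second-month dose (50 × 0.8 = 40 MME); the sketch assumes that the same proportional reduction is applied to each subsequent month's dose, which is one possible reading and is labeled here as an assumption. The function name `tapered_doses` is hypothetical.

```python
def tapered_doses(initial_mme, theta, months):
    """Monthly doses when each month's dose is (1 - theta) times the previous one (assumed)."""
    doses = [float(initial_mme)]
    for _ in range(months - 1):
        doses.append(doses[-1] * (1 - theta))
    return doses

for theta in (0.05, 0.20, 0.50):
    print(theta, [round(d, 2) for d in tapered_doses(50, theta, 3)])
```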
Finally, we note that, under the treatment policies we obtain, the duration of supply can be
a fraction of a time window, which might cause a temporary discontinuation of medications. As
indicated by FDA (2019), “Health care professionals should not abruptly discontinue opioids in a
patient who is physically dependent.” To address this, we simulate another version of our treatment
policies in which we disallow any discontinuation with respect to chronic pain management. That
is, we fix the duration of supply in each window to be equal to 30 days. Of note, the supply is still
equal to zero in a time window if opioid and non-opioid doses are both zero in that window.
Based on our results in Figure 4, we make the following observation.
Observation 1. (i) Our algorithmic-based treatment policies are more cost-effective than both
the CDC 2016 and 2022 guidelines, but this effectiveness typically reduces as WTP increases.
(ii) The 2022 CDC guidelines are more cost-effective than the 2016 CDC guidelines.
Observation 1(i) indicates that our proposed approach yields treatment policies that are more
cost-effective than the CDC guidelines. This is to some extent expected, because our
algorithmic-based treatment policies are obtained by maximizing the net monetary benefit (see
Equation (3)). Our results, however, show that even when we disallow any discontinuation of
medications for chronic pain management, our obtained policies are still more cost-effective than
the CDC guidelines. However, this advantage shrinks as we increase WTP. For example, compared to the
2022 CDC guidelines (with a tapering rate θ = 0.05), the treatment policy we obtain with no
discontinuation of supply is cost-effective in 96.03% and 85% of simulated instances under
WTP = $50K and $100K, respectively.
Observation 1(ii) implies a better performance under the 2022 guidelines than the 2016 ones. For
example, compared to the CDC 2016 guidelines (with a tapering rate θ = 0.50), our algorithmic-based
treatments with no discontinuation of supply are cost-effective in 97% and 76% of simulated
instances under WTP = $50K and $100K, respectively. However, these numbers under the 2022
CDC guidelines drop to 90.09% and 66.64%, respectively. This can be due to the fact that the 2022
guidelines have lifted the recommended thresholds on the opioid dose and days of supply, which
could, in turn, help reduce instances of undertreated pain. As the public comment period for the
first draft of the new guidelines recently ended in April 2022, the CDC will soon evaluate the final
recommendations (NPR 2022). We hope that our findings here could provide further insights for
healthcare policymakers in their efforts to improve medical guidelines based on clinical evidence.
[Figure 4: six panels plotting the cost-effectiveness probability (0.0 to 1.0) against WTP (0 to 100),
each with one curve per tapering rate θ ∈ {0.05, 0.20, 0.50}. Panels: Rolling horizon vs. guidelines (16);
Rolling horizon (ND) vs. guidelines (16); Myopic vs. guidelines (16); Rolling horizon vs. guidelines (22);
Rolling horizon (ND) vs. guidelines (22); Myopic vs. guidelines (22).]
Figure 4 Cost-effectiveness of our algorithmic-based policies compared to the human-based guidelines
Notes. x-axes represent WTP ($1,000/QALY). Results are the avg. cost-effectiveness probabilities from simulation
instances. θ is the opioid dose-tapering rate. ND: no discontinuation of medications supply for chronic pain. 16 and 22
represent the CDC’s 2016 and 2022 medical guidelines, respectively. The lower the probability, the better the
performance of the guidelines compared to our proposed policies.
Finally, our results show that the average (s.d.) improvements in QALY and cost per patient
per year due to using our treatment policy are 2.82 (2.24) days and $461.46 ($635.92), respectively.
Given that a large number of patients are affected by opioid prescriptions each year, and that
the opioid epidemic has been lingering for many years, we believe these per-patient, per-year
improvements are large enough at the societal level for policymakers to reconsider some
of their recommendations.
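To put the two averages on a single monetary scale, one can combine them into a net monetary benefit. The short calculation below is only illustrative: it assumes the standard definition (WTP times the QALY gain plus the cost saving), a 365-day year, and a WTP of λ = $10,000 per QALY; the symbols λ and ΔNMB are introduced here for illustration and are not part of our model.

```latex
\begin{align*}
\Delta\mathrm{QALY} &= \frac{2.82}{365} \approx 0.007726 \text{ years},\\
\Delta\mathrm{NMB} &= \lambda \,\Delta\mathrm{QALY} + \Delta\mathrm{Cost}_{\text{saved}}
  \approx 10{,}000 \times 0.007726 + 461.46
  \approx 77.26 + 461.46 = 538.72 \text{ dollars per patient per year}.
\end{align*}
```

Under these assumptions, most of the combined benefit comes from the cost saving rather than from the QALY gain; a larger λ would shift that balance.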
5.1.2. Multi-Modal Pain Treatments. In the previous section, we reported the results for
an average patient whose risk covariates could change throughout the time horizon. In §5.1.2-5.1.4,
we aim to analyze medical outcomes based on patients’ heterogeneity. To this end, and to handle
the sheer volume of covariate scenarios, we form cohorts of patients based on their characteristics
pre-supply, such as age, gender, pain pathology, history of surgery or an inpatient admission, and
history of behavioral factors (more details about these cohorts are provided in Appendix F.1).
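As a rough illustration of how such cohorts can be enumerated, the sketch below crosses the characteristic levels that appear in the cohort-level results of this section (e.g., Table 9). The level lists, field names, and function name are assumptions made for illustration only; the full cohort definitions are those in Appendix F.1.

```python
from itertools import product

# Levels as they appear in the cohort-level results of this section (assumed complete
# only for illustration; see Appendix F.1 for the actual cohort definitions).
GENDER_AGE = [("Female", "40's"), ("Male", "50's")]
SURGERY_OR_INPATIENT = ["Yes", "No"]
BEHAVIORAL_HISTORY = ["No", "Yes"]
PAIN_PATHOLOGY = ["Acute", "Chronic primary", "Chronic secondary"]


def enumerate_cohorts():
    """Return one dict per cohort, crossing all pre-supply characteristics."""
    return [
        {
            "gender": gender,
            "age": age,
            "surgery_or_inpatient": surgery,
            "behavioral_history": behavioral,
            "pain": pain,
        }
        for (gender, age), surgery, behavioral, pain in product(
            GENDER_AGE, SURGERY_OR_INPATIENT, BEHAVIORAL_HISTORY, PAIN_PATHOLOGY
        )
    ]


cohorts = enumerate_cohorts()
print(len(cohorts))
```

Grouping patients this way replaces a very large space of covariate scenarios with a small, fixed set of cohorts that can be reported side by side.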
For each cohort of patients, Figure 5 shows the multi-modal pain treatment plan we obtain from
our rolling-horizon approach as well as those we observe physicians have followed based on our data.
The results indicate that for any type of pain, existence of various heterogeneity factors, such as
age, gender, and history of behavioral factors, has hardly any impact on the intensity of treatments
prescribed by physicians in the practice. This “one-size-fits-all” type of pain management was also
suggested in the CDC 2016 guidelines, where no more than 50-90 MME for the opioid strength or
no more than 7 (90) days of opioid supply for acute (chronic) pain were recommended (see, e.g.,
Dowell et al. (2016) and AZDHS (2017)). To the best of our knowledge, there is no evidence-based
information on how these treatments should vary based on patients’ characteristics. Thus, we make
the following observation from our results in Figure 5.
Observation 2. Compared to the current human-based approach, our algorithmic-based treatment
policies result in
(i) more (less) intensive opioid regimens for patients with chronic pain and no pre-supply history
of behavioral factors (acute or chronic pain and a pre-supply history of behavioral factors),
(ii) more days of medications supply for females in their 40’s with either both a history of
surgery/inpatient admission and a history of behavioral factors pre-supply (when having chronic pain) or
neither of these histories (when having acute or chronic pain),
(iii) more days of medications supply for males in their 50’s with either no history of surgery or
inpatient admission pre-supply (when they have chronic pain and a history of behavioral factors),
or a history of surgery or inpatient admission pre-supply (when they have acute or chronic pain
and no history of behavioral factors), and
(iv) more (less) intensive use of non-pharmacologic treatments (non-opioid regimens) across all
cohorts of patients.
Observation 2(i) implies that patients with chronic pain and no history of behavioral factors
pre-supply (e.g., alcohol consumption or mental health disorders) will be prescribed a higher
opioid dose under our treatment policies compared to the observed practice. Also, for females in
their 40’s or those with a history of surgeries/inpatient admissions pre-supply, this higher dose will
be prescribed more aggressively (i.e., the difference in dose is larger).
As mentioned before, medical guidelines recommend an initial opioid dose to be tapered down
exogenously (see, e.g., FDA (2019)). Compared to this approach, our algorithm-driven treatment
policies account for various risk covariates to personalize this tapering behavior. This induces
an endogenous dose tapering, which is often nonlinear in time. For example, among patients
with acute/chronic pain and no history of behavioral factors pre-supply, our policies result in
steeper tapering for the following cohorts: females in their 40’s and patients with a history of
surgeries/inpatient admissions pre-supply.
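To make the contrast concrete, the sketch below generates an exogenous tapering schedule. It assumes, purely for illustration, that the dose shrinks geometrically by the tapering rate θ in every time window, starting from 50 MME over 12 monthly windows. The function name and this functional form are assumptions and may differ from the schedule used in the guidelines or in our model. An endogenous schedule, by contrast, is produced by the policy itself and need not follow any fixed rate.

```python
def exogenous_taper(initial_dose_mme, theta, n_windows):
    """Dose per window when each window's dose is (1 - theta) times the previous one.

    The geometric form is an illustrative assumption, not the guideline's formula.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if n_windows < 0:
        raise ValueError("n_windows must be non-negative")
    return [initial_dose_mme * (1.0 - theta) ** t for t in range(n_windows)]


# Tapering rates shown in Figure 4; the starting dose and horizon are assumed.
for theta in (0.05, 0.20, 0.50):
    schedule = exogenous_taper(50.0, theta, 12)
    print(theta, [round(dose, 1) for dose in schedule])
```

Because the rate is fixed in advance, every patient on the same θ follows the same curve, which is exactly what a personalized, covariate-driven tapering relaxes.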
Furthermore, our results indicate that, when a patient has chronic pain pre-supply, the duration
of supply should increase in the early stage of the therapeutic course and then be reduced towards
the end of the time horizon. This finding holds for patients with different age, gender, behavioral
factors, and surgery/inpatient admission records. The medical reason behind this finding is that
the increase in the duration of supply could balance the tapering down of medications, which, in
turn, can mitigate the potential risks of opioid withdrawal symptoms.
Figure 5 Multi-modal pain treatments obtained from our rolling-horizon policy and the practice
[Panels: (a) Female, 40 years old, surgery/inpatient adm pre-supply: yes, history of behavioral factors pre-supply: no; (b) Female, 40 years old, surgery/inpatient adm pre-supply: yes, history of behavioral factors pre-supply: yes. Each panel plots Practice and Optimal by pre-supply pain (acute, chronic P, chronic S) for four quantities: Strength: opioid (MME), Strength: non-opioid (mg), Days supply, and Non-pharm treatments use.]
Notes. x-axes represent time windows (month). Results from the rolling-horizon policy are obtained under WTP = $10K. Shades
represent 95% confidence intervals. Supply is zero when the opioid/non-opioid strengths are both zero. Non-pharmacologic
treatments: 1 (0) implies use (no use). Chronic P/S: chronic primary/secondary. Results for other cohorts are presented in
Appendix F.2.
Observation 2(iv) indicates that human-based approaches (in which a human-based judgment
is used to assess the potential benefits and risks) typically prescribe more intensive non-opioid
medications compared to our algorithmic-based treatments, and this difference is larger for acute
or chronic primary (compared to chronic secondary) pain pre-supply. In addition, this finding is
more prominent among patients who either do not have a history of surgeries/inpatient admission
pre-supply, or who have such a history but have also shown a history of behavioral factors.
We also observe from our results that the rate of using non-pharmacologic treatments (NPHTs)
under human-based approaches is typically higher for patients with chronic pain and those with
a history of behavioral factors or surgeries/inpatient admissions pre-supply compared to other
patients. Overall, however, our algorithm-driven treatment policies often make use of NPHTs at
higher rates than the human-based approaches in the practice.⁷ In particular, we observe that for
patients with acute pain pre-supply, NPHTs should be used for a few months after the onset of
pain. In addition, this period should be extended for patients with a history of behavioral factors
or a surgery/inpatient admission pre-supply. Furthermore, when there is a history of behavioral
factors, NPHTs should be used over the whole time horizon.⁸ However, when there is no such
history, NPHTs should be used towards the later stage of the time horizon, because this is the time
when the dose of pharmacologic treatments (PHTs) is already tapered down. This implies a complementary
relationship between NPHTs and PHTs, which, in turn, substantiates the importance
of a comprehensive and orchestrated multi-modal pain management plan.
⁷ One reason for the low utilization of NPHTs in the practice compared to our treatment policies could be the lack of
coverage for some of these services by insurance companies (see, e.g., Boloori et al. (2020b)). Nevertheless, we hope
that our findings here could contribute to the various efforts in adopting alternative treatments in pain management
(see, e.g., Goertz and George (2018) and Johns Hopkins (2018)).
⁸ One exception is when there is no history of surgery or inpatient admission pre-supply, which makes the use of
NPHTs over the time horizon less frequent.
5.1.3. Risks of Events 1 and 2. We now compare the incidences of Events 1 and 2 under our
algorithm-driven treatment policies and the human-based approaches observed from the medical
practice. For the former, we follow the optimal treatments (discussed in §5.1.2) in our trained RNN
and obtain the risks of Events 1 and 2. For the latter, we monitor the incidence of Events 1 and 2
from our data, and create empirical 95% confidence intervals. Our results are presented in Figure 6.
From this figure, we make the following observation.
Observation 3. Compared to the current human-based practice, our treatment policies result
in a slightly higher risk of Event 1 but considerably lower risk of Event 2.
Observation 3 indicates that, under our treatment policies, the risk of Event 1 is slightly higher
than that observed from the practice. We also note that this gap is more prominent for patients
with (1) chronic primary pain who also have a record of surgeries/inpatient admission pre-supply,
or (2) no history of behavioral factors pre-supply who are either females in their 40’s with no
record of surgeries/inpatient admission pre-supply or males in their 50’s. However, unlike the risk
of Event 1, the risk of Event 2 resulting from our treatment policies is considerably lower than that
observed in the practice. The gap between the risk of Event 2 under the treatment policies we
obtain and that in the practice is particularly large among patients with (1) chronic primary pain
who are either males in their 50’s with a history of surgeries/inpatient admission and no history of
behavioral factors pre-supply or females in their 40’s, and (2) a history of behavioral factors pre-supply
who also have a record of surgeries/inpatient admission pre-supply. Overall, these results
indicate that our treatment policies are able to use patient-specific covariates to personalize the
balance between the risks of Events 1 and 2. On average, however, they improve performance
compared to the observed practice by slightly increasing the risk of Event 1 while significantly
decreasing that of Event 2.
Figure 6 Likelihoods of incidence of Events 1 and 2 obtained from our rolling-horizon policy and the practice
[Panels: (a) Female, 40 years old, surgery/inpatient adm pre-supply: yes, history behavioral factors pre-supply: no; (b) Female, 40 years old, surgery/inpatient adm pre-supply: yes, history behavioral factors pre-supply: yes. Each panel plots the probabilities P1 and P2 under Practice and Optimal for pre-supply pain: acute, chronic primary, and chronic secondary.]
Notes. P1 (P2): risk of Event 1 (2). x-axes represent time windows (month). Results from the rolling-horizon policy are obtained
under WTP = $10K. Shades represent 95% confidence intervals. Results for other cohorts are presented in Appendix F.3.
To gain deeper insights into changes that our algorithm-driven treatment policies would cause,
if adopted in practice, we also investigate the relative risks of experiencing Event 2 compared to
Event 1. That is, under both our treatment policies and the observed practice, and for different
cohorts of patients, we measure the number of patients that would have to experience undertreated
pain for one similar patient to experience opioid-related disorders at any time during the therapeutic
course. This sheds light on a quantity known as “number needed to undertreat,” which
the American Academy of Pain Medicine deems crucial for physicians in their daily practice (Carr
2016). Based on the results in Table 9, we observe that the relative risk of Event 2 to Event 1 in
the observed practice is typically higher among (1) patients with chronic primary pain pre-supply,
(2) females in their 40’s who have either both a history of surgeries/inpatient admissions and a history of
behavioral factors or neither of these records pre-supply, and (3) patients with a history of surgeries/inpatient
admissions and behavioral factors pre-supply. In contrast, under our treatment
policies, the relative risk of Event 2 to Event 1 is higher among (1) patients with chronic primary
pain pre-supply, (2) males in their 50’s, and (3) patients with a history of surgeries/inpatient
admissions who do not have a history of behavioral factors pre-supply.
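The relative risk summarized in Table 9 can be sketched as follows. The snippet assumes that the relative risk in each time window is the ratio of the risk of Event 2 to the risk of Event 1, and that the reported average and S.D. are taken across windows using the sample standard deviation. Both choices, as well as the function name and the toy inputs, are illustrative assumptions rather than a description of our exact computation.

```python
from statistics import mean, stdev


def relative_risk_summary(p1, p2):
    """Mean and sample s.d. of P2(t) / P1(t) across time windows.

    p1, p2: per-window risks of Event 1 and Event 2 (same length, at least 2 windows).
    """
    if len(p1) != len(p2) or len(p1) < 2:
        raise ValueError("need two equally long series with at least 2 windows")
    if any(value <= 0 for value in p1):
        raise ValueError("risk of Event 1 must be positive in every window")
    ratios = [risk2 / risk1 for risk1, risk2 in zip(p1, p2)]
    return mean(ratios), stdev(ratios)


# Toy (hypothetical) risks for two windows, not taken from our data.
avg, sd = relative_risk_summary([0.01, 0.02], [0.20, 0.20])
print(round(avg, 2), round(sd, 2))
```

A ratio of, say, 15 would mean that roughly fifteen patients experience undertreated pain for each similar patient who experiences an opioid-related disorder, which is the "number needed to undertreat" reading used above.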
Table 9 Average (S.D.) of relative risk of Event 2 to Event 1 over the time horizon
Cohort                                                                        Policy
Gender  Age   Surgery/inpatient     History of behavioral  Pain pathology     Practice       Optimal
              admission pre-supply  factors pre-supply     pre-supply
Female  40’s  Yes                   No                     Acute              16.07 (7.01)   2.77 (0.51)
Female  40’s  Yes                   No                     Chronic primary    21.10 (10.04)  5.03 (0.89)
Female  40’s  Yes                   No                     Chronic secondary  17.24 (7.59)   5.56 (1.29)
Female  40’s  Yes                   Yes                    Acute              19.13 (6.43)   2.77 (0.57)
Female  40’s  Yes                   Yes                    Chronic primary    24.35 (8.97)   4.85 (1.30)
Female  40’s  Yes                   Yes                    Chronic secondary  18.58 (8.46)   3.63 (0.81)
Female  40’s  No                    No                     Acute              7.39 (4.12)    2.24 (0.56)
Female  40’s  No                    No                     Chronic primary    13.50 (7.18)   3.99 (0.89)
Female  40’s  No                    No                     Chronic secondary  10.76 (4.46)   4.41 (1.77)
Female  40’s  No                    Yes                    Acute              7.75 (4.11)    7.62 (1.96)
Female  40’s  No                    Yes                    Chronic primary    14.31 (6.87)   7.27 (1.45)
Female  40’s  No                    Yes                    Chronic secondary  12.22 (4.64)   7.57 (2.10)
Male    50’s  Yes                   No                     Acute              18.32 (10.16)  3.31 (0.66)
Male    50’s  Yes                   No                     Chronic primary    21.79 (11.39)  6.79 (1.17)
Male    50’s  Yes                   No                     Chronic secondary  18.50 (7.37)   8.36 (2.09)
Male    50’s  Yes                   Yes                    Acute              19.55 (8.63)   6.10 (1.87)
Male    50’s  Yes                   Yes                    Chronic primary    23.32 (10.25)  7.96 (2.51)
Male    50’s  Yes                   Yes                    Chronic secondary  17.55 (5.09)   6.33 (1.67)
Male    50’s  No                    No                     Acute              4.97 (3.48)    2.47 (0.86)
Male    50’s  No                    No                     Chronic primary    9.34 (4.44)    3.90 (1.22)
Male    50’s  No                    No                     Chronic secondary  9.38 (2.50)    5.54 (1.99)
Male    50’s  No                    Yes                    Acute              8.19 (4.19)    7.44 (1.89)
Male    50’s  No                    Yes                    Chronic primary    12.81 (6.19)   9.74 (1.82)
Male    50’s  No                    Yes                    Chronic secondary  17.83 (3.91)   9.80 (2.30)
Figure 7 Impact of the reduction in the optimal risk of Event 1 (\(P_1^*\))
[Panels: (Left) schematic linking treatments to the risks of Events 1 and 2 under an x% reduction in \(P_1^*\). (Right) net benefit loss ($), 500 to 2000, against % reduction in \(P_1^*\), 10 to 90, by pre-supply pain (acute, chronic primary, chronic secondary), shown separately for history behavioral factors pre-supply: No and Yes.]
Notes. (Left) A schematic illustration of how an x% reduction in \(P_1^*\) alters the treatment (and hence the risks of Events 1-2). The
net benefit loss is the difference between the net benefits when following the altered (instead of optimal) treatments throughout
the time horizon. (Right) Net benefit loss. Results from the rolling-horizon policy are obtained for WTP = $10K and 50 year-old
males with a history of surgeries/inpatient admissions pre-supply. Results for other cohorts are presented in Appendix F.4.
We also note that the main goal of the CDC guidelines has been centered around minimizing
the risk of Event 1. Our results discussed thus far indicate that, by and large, this is not a good
objective to follow: what is needed is a careful and personalized balance between risks of Events 1
and 2. However, to gain a better understanding of the importance of this, we consider a hypothetical
policy in which the risk of Event 1 is fully reduced to 0.⁹ Let \(P_1^*(t)\) be the risk of Event 1 in
window \(t\) resulting from the rolling-horizon policy and \(P_1(t) = P_1^*(t)(100 - x)\%\) be the risk resulting
from an x% reduction in \(P_1^*(t)\). It should be noted that any such reduction in the risk of Event 1
affects the risk of Event 2 (see Figure 7). To measure the overall impact, we let \(NB^*\) and \(NB\)
be the corresponding total monetary net benefits when using these risks over the time horizon,
respectively. In Figure 7, we show how the total net benefit loss, \(NB^* - NB\), changes based on
variations in \(x\). If we follow treatments resulting in no risk of Event 1 over the therapeutic course
(i.e., a 100% reduction in \(P_1^*(t)\), \(\forall t \in T\)), we find that, for patients with a history of behavioral
factors pre-supply, the society would lose up to $1,700 of net benefit per patient (this holds across
patients with different age, gender, and surgery records). Furthermore, when there is no history
of behavioral factors pre-supply, the net benefit loss would be up to $2,000 ($1,300) per patient
in the presence (absence) of surgeries/inpatient admission pre-supply. We also observe that the
net benefit loss is typically higher for patients with acute (compared to chronic) pain pre-supply,
although the former has a lower risk of Event 2 compared to the latter. This can be due to the
fact that the cost of experiencing Event 2 is higher when the pain-inducing condition is acute
(compared to chronic) (see Table 8).
Put together, our results in this section indicate that the CDC’s aim of solely reducing incidence
of opioid-related disorders can be harmful. Instead, we find that following algorithmic-based policies
that strike a personalized balance between reducing the risk of opioid-related disorders and increasing
that of under-treated pain can yield a higher net benefit.
⁹ As Figure 6 shows, a drop in the risk of Event 1 would coincide with a much higher increase in the risk of Event 2.
Therefore, the drop in the cost of Event 1 incidence could come at the expense of more frequent visits due to higher
incidence of Event 2. This can negatively impact the net benefit gained.
Figure 8 QALY and cost gained per patient from our rolling-horizon policy and the practice
[Panels: (a) Female, 40 years old, surgery/inpatient adm pre-supply: yes, history behavioral factors pre-supply: no; (b) Female, 40 years old, surgery/inpatient adm pre-supply: yes, history behavioral factors pre-supply: yes. Each panel plots expected cost ($) against expected QALY (years) under Practice and Optimal for pre-supply pain: acute, chronic primary, and chronic secondary.]
Notes. Results from the rolling-horizon policy are obtained under WTP = $10K. To account for the variability of the risks of
Events 1 and 2 under the practice, we randomly select 10,000 pairs of these risks from the corresponding confidence intervals
(shown in Figure 6) and report the (QALY,Cost) pair here. Results for other cohorts are presented in Appendix F.5.
5.1.4. Who Benefits the Most? We now investigate the type of patients that would benefit
most from our treatment policies compared to the practice of physicians. To this end, we make
use of the risks of Events 1 and 2 (illustrated in Figure 6) to measure the QALY and cost gained.
Based on the results in Figure 8, we make the following observation.
Observation 4. By following our algorithmic-based treatment policies, patients with the following
conditions pre-supply would benefit most in terms of both QALY and cost: acute or chronic
primary pain and a history of behavioral factors or surgery/inpatient admission. This finding holds
across patients with different age or gender.
As discussed earlier under Observation 3, the risk of Event 2 (1) is lower (higher) under our
treatment policies compared to the observed practice of physicians. Moreover, across these policies,
the difference in the risk of Event 2 (1) is more (less) prominent for patients with acute or chronic
primary pain compared to those with chronic secondary pain pre-supply. A similar result holds for
patients with a record of surgery/inpatient admission or behavioral factors pre-supply.
5.2. Robustness Checks
We conduct a variety of robustness checks on our baseline parameters (Table 8) to measure the
sensitivity of our results to various potential misspecifications. In what follows, we focus on presenting
our robustness checks with respect to our cost-effectiveness results. We do so because many
of our main results are driven by the fact that our framework allows obtaining treatment policies
that are not only more personalized, and hence, effective in improving patient outcomes (improving
QALY), but are also less costly than CDC guidelines or the observed practice of human experts.
Thus, if our sensitivity results indicate that our treatment policies remain cost-effective even when
we alter the estimated parameters, it can give us further confidence about the validity of our main
findings.
qol scores. Compared to the values under the baseline setting, we consider alternative scenarios
where the qol scores for Events 1 and 2 are perturbed (increased or decreased). Based on the results
presented in Table 10, we observe that the treatment policies we obtain remain cost-effective even
when we alter the qol scores. The cost-effectiveness of our policies compared to the guidelines
improves in particular under two circumstances: when the qol score for experiencing Event 1 (2)
increases (decreases), which is the case when the severity of opioid-related disorders decreases
(e.g., opioid dependence compared to abuse/poisoning) and/or when the level of undertreated pain
increases.
Costs. Similarly, we consider different scenarios for the cost of experiencing Events 1 and 2 by
altering the estimated values we used in our baseline setting. From the results presented in Table 10,
we find that our treatment policies remain cost-effective compared to the CDC guidelines even
when the estimated