Dimitris Bertsimas’s research while affiliated with Massachusetts Institute of Technology and other places


Publications (25)


Patient Outcome Predictions Improve Operations at a Large Hospital Network
  • Preprint

May 2023 · 46 Reads · 3 Citations

Liangyuan Na · Kimberly Villalobos Carballo · [...] · Dimitris Bertsimas

Problem definition: Access to accurate predictions of patients' outcomes can enhance medical staff's decision-making, which ultimately benefits all stakeholders in the hospitals. A large hospital network in the US has been collaborating with academics and consultants to predict short-term and long-term outcomes for all inpatients across their seven hospitals. Methodology/results: We develop machine learning models that predict the probabilities of next 24-hr/48-hr discharge and intensive care unit transfers, end-of-stay mortality and discharge dispositions. All models achieve high out-of-sample AUC (75.7%-92.5%) and are well calibrated. In addition, combining 48-hr discharge predictions with doctors' predictions simultaneously enables more patient discharges (10%-28.7%) and fewer 7-day/30-day readmissions (p-value < 0.001). We implement an automated pipeline that extracts data and updates predictions every morning, as well as user-friendly software and a color-coded alert system to communicate these patient-level predictions (alongside explanations) to clinical teams. Managerial implications: Since we have been gradually deploying the tool, and training medical staff, over 200 doctors, nurses, and case managers across seven hospitals use it in their daily patient review process. We observe a significant reduction in the average length of stay (0.67 days per patient) following its adoption and anticipate substantial financial benefits (between $55 and $72 million annually) for the healthcare system.


Dynamic optimization with side information

March 2022 · 47 Reads · 40 Citations

European Journal of Operational Research

We develop a tractable and flexible data-driven approach for incorporating side information into multi-stage stochastic programming. The proposed framework uses predictive machine learning methods (such as k-nearest neighbors, kernel regression, and random forests) to weight the relative importance of various data-driven uncertainty sets in a robust optimization formulation. Through a novel measure concentration result for a class of supervised machine learning methods, we prove that the proposed approach is asymptotically optimal for multi-period stochastic programming with side information. We also describe a general-purpose approximation for these optimization problems, based on overlapping linear decision rules, which is computationally tractable and produces high-quality solutions for dynamic problems with many stages. Across a variety of multi-stage and single-stage examples in inventory management, finance, and shipment planning, our method achieves improvements of up to 15% over alternatives and requires less than one minute of computation time on problems with twelve stages.
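As a rough illustration of the weighting idea in the abstract above, the sketch below applies k-nearest-neighbor weights to a single-stage newsvendor problem. It is a simplified stand-in, not the paper's multi-stage robust formulation: the newsvendor setting, the cost parameters, and every function name here are assumptions made for the example.

```python
import math


def knn_weights(history_x, new_x, k):
    """Give weight 1/k to the k historical samples closest to new_x, else 0."""
    dists = [math.dist(x, new_x) for x in history_x]
    # Break distance ties by index so the result is deterministic.
    nearest = sorted(range(len(dists)), key=lambda i: (dists[i], i))[:k]
    weights = [0.0] * len(history_x)
    for i in nearest:
        weights[i] = 1.0 / k
    return weights


def newsvendor_cost(order, demand, underage=4.0, overage=1.0):
    """Assumed cost: pay `underage` per unit short and `overage` per unit left."""
    return underage * max(demand - order, 0.0) + overage * max(order - demand, 0.0)


def prescribe_order(history_x, history_demand, new_x, k=3):
    """Pick the order quantity minimizing the kNN-weighted historical cost."""
    weights = knn_weights(history_x, new_x, k)

    def weighted_cost(order):
        return sum(
            w * newsvendor_cost(order, d)
            for d, w in zip(history_demand, weights)
        )

    # The weighted cost is convex and piecewise linear in the order quantity,
    # so a minimizer lies at one of the positively weighted demand values.
    candidates = [d for d, w in zip(history_demand, weights) if w > 0]
    return min(candidates, key=weighted_cost)
```

If the three nearest past observations have demands 10, 12, and 11, the sketch returns an order of 12 under the assumed costs (underage 4, overage 1), because running short is penalized more heavily than overstocking.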


Global Optimization via Optimal Decision Trees

February 2022 · 16 Reads

The global optimization literature places large emphasis on reducing intractable optimization problems into more tractable structured optimization forms. In order to achieve this goal, many existing methods are restricted to optimization over explicit constraints and objectives that use a subset of possible mathematical primitives. These are limiting in real-world contexts where more general explicit and black box constraints appear. Leveraging the dramatic speed improvements in mixed-integer optimization (MIO) and recent research in machine learning, we propose a new method to learn MIO-compatible approximations of global optimization problems using optimal decision trees with hyperplanes (OCT-Hs). This constraint learning approach only requires a bounded variable domain, and can address both explicit and inexplicit constraints. We solve the MIO approximation efficiently to find a near-optimal, near-feasible solution to the global optimization problem. We further improve the solution using a series of projected gradient descent iterations. We test the method on a number of numerical benchmarks from the literature as well as real-world design problems, demonstrating its promise in finding global optima efficiently.


Benchmarking in Congenital Heart Surgery Using Machine Learning-Derived Optimal Classification Trees

November 2021 · 37 Reads · 12 Citations

World Journal for Pediatric and Congenital Heart Surgery

Background: We have previously shown that the machine learning methodology of optimal classification trees (OCTs) can accurately predict risk after congenital heart surgery (CHS). We have now applied this methodology to define benchmarking standards after CHS, permitting case-adjusted hospital-specific performance evaluation. Methods: The European Congenital Heart Surgeons Association Congenital Database data subset (31 792 patients) who had undergone any of the 10 “benchmark procedure group” primary procedures were analyzed. OCT models were built predicting hospital mortality (HM), and prolonged postoperative mechanical ventilatory support time (MVST) or length of hospital stay (LOS), thereby establishing case-adjusted benchmarking standards reflecting the overall performance of all participating hospitals, designated as the “virtual hospital.” These models were then used to predict individual hospitals’ expected outcomes (both aggregate and, importantly, for risk-matched patient cohorts) for their own specific cases and case-mix, based on OCT analysis of aggregate data from the “virtual hospital.” Results: The raw average rates were HM = 4.4%, MVST = 15.3%, and LOS = 15.5%. Of 64 participating centers, in comparison with each hospital's specific case-adjusted benchmark, 17.0% statistically (under 90% confidence intervals) overperformed and 26.4% underperformed with respect to the predicted outcomes for their own specific cases and case-mix. For MVST and LOS, overperformers were 34.0% and 26.4%, and underperformers were 28.3% and 43.4%, respectively. OCT analyses reveal hospital-specific patient cohorts of either overperformance or underperformance. Conclusions: OCT benchmarking analysis can assess hospital-specific case-adjusted performance after CHS, both overall and patient cohort-specific, serving as a tool for hospital self-assessment and quality improvement.



Learning Mixed-Integer Convex Optimization Strategies for Robot Planning and Control

April 2020 · 64 Reads

Mixed-integer convex programming (MICP) has seen significant algorithmic and hardware improvements with several orders of magnitude solve time speedups compared to 25 years ago. Despite these advances, MICP has been rarely applied to real-world robotic control because the solution times are still too slow for online applications. In this work, we extend the machine learning optimizer (MLOPT) framework to solve MICPs arising in robotics at very high speed. MLOPT encodes the combinatorial part of the optimal solution into a strategy. Using data collected from offline problem solutions, we train a multiclass classifier to predict the optimal strategy given problem-specific parameters such as states or obstacles. Compared to previous approaches, we use task-specific strategies and prune redundant ones to significantly reduce the number of classes the predictor has to select from, thereby greatly improving scalability. Given the predicted strategy, the control task becomes a small convex optimization problem that we can solve in milliseconds. Numerical experiments on a cart-pole system with walls, a free-flying space robot and task-oriented grasps show that our method provides not only 1 to 2 orders of magnitude speedups compared to state-of-the-art solvers but also performance close to the globally optimal MICP solution.



From Predictive to Prescriptive Analytics

August 2019 · 595 Reads · 622 Citations

Management Science

We combine ideas from machine learning (ML) and operations research and management science (OR/MS) in developing a framework, along with specific methods, for using data to prescribe optimal decisions in OR/MS problems. In a departure from other work on data-driven optimization, we consider data consisting, not only of observations of quantities with direct effect on costs/revenues, such as demand or returns, but also predominantly of observations of associated auxiliary quantities. The main problem of interest is a conditional stochastic optimization problem, given imperfect observations, where the joint probability distributions that specify the problem are unknown. We demonstrate how our proposed methods are generally applicable to a wide range of decision problems and prove that they are computationally tractable and asymptotically optimal under mild conditions, even when data are not independent and identically distributed and for censored observations. We extend these to the case in which some decision variables, such as price, may affect uncertainty and their causal effects are unknown. We develop the coefficient of prescriptiveness P to measure the prescriptive content of data and the efficacy of a policy from an operations perspective. We demonstrate our approach in an inventory management problem faced by the distribution arm of a large media company, shipping 1 billion units yearly. We leverage both internal data and public data harvested from IMDb, Rotten Tomatoes, and Google to prescribe operational decisions that outperform baseline measures. Specifically, the data we collect, leveraged by our methods, account for an 88% improvement as measured by our coefficient of prescriptiveness. This paper was accepted by Noah Gans, optimization.


Online Mixed-Integer Optimization in Milliseconds

July 2019 · 151 Reads

We propose a method to solve online mixed-integer optimization (MIO) problems at very high speed using machine learning. By exploiting the repetitive nature of online optimization, we are able to greatly speed up the solution time. Our approach encodes the optimal solution into a small amount of information denoted as strategy using the Voice of Optimization framework proposed in [BS18]. In this way the core part of the optimization algorithm becomes a multiclass classification problem which can be solved very quickly. In this work we extend that framework to real-time and high-speed applications focusing on parametric mixed-integer quadratic optimization (MIQO). We propose an extremely fast online optimization algorithm consisting of a feedforward neural network (NN) evaluation and a linear system solution where the matrix has already been factorized. Therefore, this online approach does not require any solver nor iterative algorithm. We show the speed of the proposed method both in terms of total computations required and measured execution time. We estimate the number of floating point operations (flops) required to completely recover the optimal solution as a function of the problem dimensions. Compared to state-of-the-art MIO routines, the online running time of our method is very predictable and can be lower than a single matrix factorization time. We benchmark our method against the state-of-the-art solver Gurobi obtaining from two to three orders of magnitude speedups on benchmarks with real-world data.
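The two-step online procedure described in the abstract above (predict a strategy, then solve one pre-factorized linear system) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy problem has no integer variables, a strategy is taken to be just the set of tight constraints, `predict_strategy` is a hypothetical hand-written stand-in for the trained neural network, and all names are invented for the example. The toy problem is to minimize 0.5 * (x0^2 + x1^2) subject to x0 + x1 >= theta.

```python
def lu_factor(matrix):
    """LU-factorize a square matrix in place on a copy, with partial pivoting."""
    n = len(matrix)
    lu = [row[:] for row in matrix]
    perm = list(range(n))
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(lu[r][col]))
        if abs(lu[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        lu[col], lu[pivot] = lu[pivot], lu[col]
        perm[col], perm[pivot] = perm[pivot], perm[col]
        for row in range(col + 1, n):
            lu[row][col] /= lu[col][col]
            for j in range(col + 1, n):
                lu[row][j] -= lu[row][col] * lu[col][j]
    return lu, perm


def lu_solve(lu, perm, rhs):
    """Solve using stored factors: forward then backward substitution only."""
    n = len(lu)
    y = [rhs[p] for p in perm]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


P = [[1.0, 0.0], [0.0, 1.0]]  # objective 0.5 * x' P x
A = [[1.0, 1.0]]              # one constraint: x0 + x1 >= theta
STRATEGIES = [(), (0,)]       # each strategy lists the tight constraints


def kkt_matrix(tight):
    """KKT matrix [[P, A_t'], [A_t, 0]] with tight rows treated as equalities."""
    n, m = len(P), len(tight)
    rows = [P[i] + [A[t][i] for t in tight] for i in range(n)]
    rows += [A[t] + [0.0] * m for t in tight]
    return rows


# Offline step: factorize one KKT matrix per strategy, once.
FACTORS = [lu_factor(kkt_matrix(s)) for s in STRATEGIES]


def predict_strategy(theta):
    """Hypothetical stand-in for the trained multiclass classifier."""
    return 1 if theta > 0 else 0


def solve_online(theta):
    """Online step: classify, then back-substitute with the stored factors."""
    s = predict_strategy(theta)
    rhs = [0.0] * len(P) + [theta for _ in STRATEGIES[s]]
    lu, perm = FACTORS[s]
    return lu_solve(lu, perm, rhs)[: len(P)]
```

In this sketch the online call never factorizes a matrix or runs an iterative solver; all of that cost is paid once when `FACTORS` is built.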


The K-Server Problem via a Modern Optimization Lens

July 2019 · 32 Reads · 3 Citations

European Journal of Operational Research

We consider the well-known K-server problem from the perspective of mixed integer, robust and adaptive optimization. We propose a new tractable mixed integer linear formulation of the K-server problem that incorporates both information from the past and uncertainty about the future. By combining ideas from classical online algorithms developed in the computer science literature and robust and adaptive optimization developed in the operations research literature we propose a new method that (a) is computationally tractable, (b) almost always outperforms all other methods in numerical experiments, and (c) is stable with respect to potential errors in the assumptions about the future.


Citations (18)


... Similar benefits have been documented in previous research. For example, Na and colleagues reported that ML models could enhance decision-making and reduce average hospital stays by predicting patient outcomes in a large hospital network [30]. Additionally, Bishara et al. have shown that the application of ML in acute care settings can improve hospital operational management and patient outcomes using supervised, unsupervised, and reinforcement learning algorithms [31]. ...

Reference:

Leverage machine learning to identify key measures in hospital operations management: a retrospective study to explore feasibility and performance of four common algorithms
Patient Outcome Predictions Improve Operations at a Large Hospital Network
  • Citing Preprint
  • May 2023

... Representing a range of possible data distributions in these sets, often defined using statistical measures such as moment constraints or Wasserstein distances, has provided a potential trade-off between the stochastic and robust frameworks (Delage and Ye, 2010; Wiesemann et al., 2014). Contextual information has been incorporated into DRO to enhance the modeling of the ambiguity existing in the underlying probability distribution of the uncertainty: (Bertsimas et al., 2023a) combines the prescriptive analytics approach in (Bertsimas and Kallus, 2020) with the RO techniques in (Bertsimas et al., 2022, 2023b) to extend these works and make a weighted average guided by the relevance of the training samples to the new side information. (Esteban-Pérez and Morales, 2022) addresses the problem of constructing an ambiguity set by exploiting the connection between trimmings and the partial mass problem (restricted to the p-Wasserstein metric) to immunize the decision against the error incurred in the process of inferring conditional information from joint data. ...

Dynamic optimization with side information
  • Citing Article
  • March 2022

European Journal of Operational Research

... A meta-analysis to assess and compare the performance of the models was not possible due to missing data, yet we can still observe the robust predictive performance exhibited by different artificial intelligence models, regardless of the outcomes they were designed to predict. The models included focused on the prediction of mortality and survival (n = 16) [27, 29, 31, 33, 37, 40, 46-48, 52-55, 60], prolonged length of hospital or ICU stay (n = 7) [29,40,43,46,47,57,58], postoperative complications (n = 6) [34,35,41,42,44,59] prolonged mechanical ventilatory support time (n = 4) [40,47,57,59], with additional focus on specific outcomes such as periventricular leucomalacia [32,39], acute kidney injury [56], malnutrition [50]. The AUCs of the models ranged between 0.52 to 0.997, with most models achieving an AUC above 0.7, highlighting the predictive potential of artificial intelligence in congenital heart surgery. ...

Benchmarking in Congenital Heart Surgery Using Machine Learning-Derived Optimal Classification Trees
  • Citing Article
  • November 2021

World Journal for Pediatric and Congenital Heart Surgery

... In order to incorporate the contextual information into the decision-making problem, [4] employ a parameterized mapping from contextual information to inventory decision and select parameters to achieve the best empirical performance based on the available data. This method, termed 'decision rule approach', is also applied to portfolio selection problem by [5] and extended to general stochastic programming problems using reproducing kernel Hilbert spaces in [10] and a non-convex piecewise affine decision rule in [70]. [68] explore a distributionally robust framework for newsvendor problem and identify a class of nonparametric policies that interpolate an optimal in-sample policy to unobserved feature values in a specific manner. ...

Data-Driven Optimization: A Reproducing Kernel Hilbert Space Approach
  • Citing Article
  • March 2021

Operations Research

... In addition to decision tree-based algorithms, classification algorithms such as support vector machines (SVMs), Naïve Bayes, and K-nearest neighbours (KNNs) are extensively employed for strategic decision-making purposes. As defined by some scholars (Ferreira et al. 2016; Bertsimas and Kallus, 2020) these ML methods are instrumental in predictive scenarios, including: ...

From Predictive to Prescriptive Analytics
  • Citing Article
  • August 2019

Management Science

... The major features that characterize humanitarian logistics operations are the volatile, uncertain, complex, and ambiguous nature of disasters. The relief operations studied in the literature include vehicle routing [28], inventory allocation [29], inventory management [30], aid vehicle dispatch [31], and inventory distribution problems [32]. These problems were solved using simulations, optimization, and meta-heuristics. ...

Robust and Stochastic Formulations for Ambulance Deployment and Dispatch
  • Citing Article
  • May 2019

European Journal of Operational Research

... Public data from platforms such as Waze, Uber, or Google Maps usually provide only location and departure/arrival times. In these cases, as mentioned in [1], advanced machine learning techniques, which can account for complex interactions and non-linearities in traffic patterns, can be trained on historical traffic data to predict O-D travel times. ...

Travel Time Estimation in the Age of Big Data
  • Citing Article
  • March 2019

Operations Research

... This alignment with peak hours increases the operational costs for schools due to higher energy rates. Previous research has extensively explored the impact of school start times on student performance [15,16], health [16][17][18], and transportation logistics [19], with a significant focus on adolescent sleep patterns and academic outcomes. Although substantial research has been conducted on the impacts of school start times in other areas, the energy-related implications of these schedules have not been explored as thoroughly, leaving a gap in understanding how changes in school schedules could affect energy consumption and cost efficiency. ...

Optimizing schools’ start time and bus routes
  • Citing Article
  • March 2019

Proceedings of the National Academy of Sciences

... Studies (Kondor et al., 2022; Vazifeh et al., 2018) show that with non-coordinating greedy policies, each additional ride-hailing company in the market can largely increase the total number of circulating vehicles than necessary. Heuristic-based methods (Bertsekas, 1979; Bertsimas et al., 2019; Croes, 1958) often generate myopic policies due to the lack of consideration for future demand. RL methods have been proposed in both offline (Ulmer et al., 2019; Farazi et al., 2021) and online (Silver & Veness, 2010; Somani et al., 2013; Bent & Van Hentenryck, 2004) regimes; however, they either lack robustness to distribution shifts or are computationally expensive. ...

Online Vehicle Routing: The Edge of Optimization in Large-Scale Applications
  • Citing Article
  • January 2019

Operations Research