Sandjai Bhulai

  • Professor (Full) at Vrije Universiteit Amsterdam

About

186
Publications
56,755
Reads
2,087
Citations
Introduction
Sandjai Bhulai is a full professor of Business Analytics at Vrije Universiteit Amsterdam. He studied "Mathematics" and "Business Mathematics and Informatics", and obtained a Ph.D. on Markov decision processes for the control of complex, high-dimensional systems. He is a co-founder of the Amsterdam Center for Business Analytics (ACBA) and co-founder of the postgraduate program Business Analytics / Data Science. Sandjai's research is on the interface of mathematics, computer science, and operations management. His specialization is in decision making under uncertainty, optimization, data science, and business analytics. His current research projects focus on HR analytics, social media analytics, predictive analytics, dynamic pricing, and planning and scheduling in complex systems.
Current institution
Vrije Universiteit Amsterdam
Current position
  • Professor (Full)

Publications

Preprint
Full-text available
The sustainability of the ocean ecosystem is threatened by increased levels of sound pollution, making monitoring crucial to understand its variability and impact. Passive acoustic monitoring (PAM) systems collect a large amount of underwater sound recordings, but the large volume of data makes manual analysis impossible, creating the need for auto...
Preprint
Full-text available
The increasing level of sound pollution in marine environments poses an increased threat to ocean health, making it crucial to monitor underwater noise. By monitoring this noise, the sources responsible for this pollution can be mapped. Monitoring is performed by passively listening to these sounds. This generates a large amount of data records, ca...
Article
Full-text available
Evaluation metrics provide a means for quantifying and comparing performances of supervised learning models, but drawing meaningful conclusions from acquired scores requires a contextual framework. Our paper addresses this by introducing the Dutch scaler (DS), a novel performance indicator for binary classification models. It quantifies a model’s l...
Preprint
Power grid operation is becoming increasingly complex due to the rising integration of renewable energy sources and the need for more adaptive control strategies. Reinforcement Learning (RL) has emerged as a promising approach to power network control (PNC), offering the potential to enhance decision-making in dynamic and uncertain environments. Th...
Article
Full-text available
Background Psychosocial autopsy is a retrospective study of suicide, aimed to identify emerging themes and psychosocial risk factors. It typically relies heavily on qualitative data from interviews or medical documentation. However, qualitative research has often been scrutinized for being prone to bias and is notoriously time- and cost-intensive....
Preprint
In the era of telecommunications, the increasing demand for complex and specialized communication systems has led to a focus on improving physical layer communications. Artificial intelligence (AI) has emerged as a promising solution avenue for doing so. Deep neural receivers have already shown significant promise in improving the performance of co...
Article
Full-text available
As telecommunication systems evolve to meet increasing demands, integrating deep neural networks (DNNs) has shown promise in enhancing performance. However, the trade-off between accuracy and flexibility remains challenging when replacing traditional receivers with DNNs. This paper introduces a novel probabilistic framework that allows a single DNN...
Conference Paper
Background Psychological autopsy is essential to establish theories, explore trends and identify previously unexplored psychosocial risk factors in suicide research. However, qualitative research has been scrutinized for being prone to interpretation bias, problems with accuracy, challenges to reproducibility, and is very time- and cost intensive....
Article
Full-text available
Novel prediction methods should always be compared to a baseline to determine their performance. Without this frame of reference, the performance score of a model is basically meaningless. What does it mean when a model achieves an $F_1$ of 0.8 on a test set? A proper baseline is, therefore, required to evaluate the ‘goodness’ of a performance scor...
Preprint
Full-text available
The growing privacy concerns surrounding face image data demand new techniques that can guarantee user privacy. One such face recognition technique that claims to achieve better user privacy is Federated Face Recognition (FFR), a subfield of Federated Learning (FL). However, FFR faces challenges due to the heterogeneity of the data, given the large...
Preprint
Full-text available
As telecommunication systems evolve to meet increasing demands, integrating deep neural networks (DNNs) has shown promise in enhancing performance. However, the trade-off between accuracy and flexibility remains challenging when replacing traditional receivers with DNNs. This paper introduces a novel probabilistic framework that allows a single DNN...
Preprint
Full-text available
Encryption on the internet with the shift to HTTPS has been an important step to improve the privacy of internet users. However, there is an increasing body of work about extracting information from encrypted internet traffic without having to decrypt it. Such attacks bypass security guarantees assumed to be given by HTTPS and thus need to be under...
Article
A challenge in same-day delivery operations is that delivery requests are typically not known beforehand, but are instead revealed dynamically during the day. This uncertainty introduces a trade-off between dispatching vehicles to serve requests as soon as they are revealed to ensure timely delivery and delaying the dispatching decision to consolid...
Preprint
BACKGROUND To provide optimal care in a suicide prevention helpline, it is important to know what contributes to positive or negative effects on help seekers. Helplines can often be contacted through chat services, which produce large amounts of text data, to use in large-scale analysis. OBJECTIVE We trained a machine learning classification model...
Article
Full-text available
Background For the provision of optimal care in a suicide prevention helpline, it is important to know what contributes to positive or negative effects on help seekers. Helplines can often be contacted through text-based chat services, which produce large amounts of text data for use in large-scale analysis. Objective We trained a machine learning...
Preprint
Full-text available
Recent challenges in operating power networks arise from increasing energy demands and unpredictable renewable sources like wind and solar. While reinforcement learning (RL) shows promise in managing these networks, through topological actions like bus and line switching, efficiently handling large action spaces as networks grow is crucial. This pa...
Preprint
Full-text available
A challenge in same-day delivery operations is that delivery requests are typically not known beforehand, but are instead revealed dynamically during the day. This uncertainty introduces a trade-off between dispatching vehicles to serve requests as soon as they are revealed to ensure timely delivery, and delaying the dispatching decision to consoli...
Article
Full-text available
Individuals with autism increasingly enroll in universities, but little is known about predictors for their success. This study developed predictive models for the academic success of autistic bachelor students (N = 101) in comparison to students with other health conditions (N = 2465) and students with no health conditions (N = 25,077). We applied...
Article
Full-text available
In many machine learning applications, the labeling of datasets is done by human experts, which is usually time-consuming in cases of large data sets. This raises the need for methods to make optimal use of the human expert by selecting model instances for which the expert opinion is of most added value. This paper introduces the problem of active...
Article
Full-text available
Summary Introduction For suicide prevention, it is important to identify groups at elevated risk of suicide as accurately as possible. So far, little is known about interactions among multiple risk factors. Machine learning methods offer new opportunities for flexible, data-driven, hypothesis-free, and robust research into the i...
Article
Full-text available
Background Each year, many help seekers in need contact health helplines for mental support. It is crucial that they receive support immediately, and that waiting times are minimal. In order to minimize delay, helplines must have adequate staffing levels, especially during peak hours. This has raised the need for means to predict the call and chat...
Article
Before any binary classification model is taken into practice, it is important to validate its performance on a proper test set. Without a frame of reference given by a baseline method, it is impossible to determine if a score is “good” or “bad.” The goal of this paper is to examine all baseline methods that are independent of feature values and de...
Article
Measuring and quantifying dependencies between random variables (RVs) can give critical insights into a dataset. Typical questions are: ‘Do underlying relationships exist?’, ‘Are some variables redundant?’, and ‘Is some target variable Y highly or weakly dependent on variable X ?’ Interestingly, despite the evident need for a general-purpose measur...
Chapter
This paper presents a simulation-optimization approach to strategic workforce planning based on deep reinforcement learning. A domain expert expresses the organization’s high-level, strategic workforce goals over the workforce composition. A policy that optimizes these goals is then learned in a simulation-optimization loop. Any suitable simulator...
Article
Background: Targeted interventions for suicide prevention rely on adequate identification of groups at elevated risk. Several risk factors for suicide are known, but little is known about the interactions between risk factors. Interactions between risk factors may aid in detecting more specific sub-populations at higher risk. Methods: Here, we u...
Preprint
Over the past few years, the use of machine learning models has emerged as a generic and powerful means for prediction purposes. At the same time, there is a growing demand for interpretability of prediction models. To determine which features of a dataset are important to predict a target variable $Y$, a Feature Importance (FI) method can be used....
Preprint
Before any binary classification model is taken into practice, it is important to validate its performance on a proper test set. Without a frame of reference given by a baseline method, it is impossible to determine if a score is ‘good’ or ‘bad’. The goal of this paper is to examine all baseline methods that are independent of feature values and de...
Article
Full-text available
Background For mechanically ventilated critically ill COVID-19 patients, prone positioning has quickly become an important treatment strategy; however, prone positioning is labor intensive and comes with potential adverse effects. Therefore, identifying which critically ill intubated COVID-19 patients will benefit may help allocate labor resources....
Preprint
Full-text available
This paper presents a simulation-optimization approach to strategic workforce planning based on deep reinforcement learning. A domain expert expresses the organization's high-level, strategic workforce goals over the workforce composition. A policy that optimizes these goals is then learned in a simulation-optimization loop. Any suitable simulator...
Article
Vehicle damages are increasingly becoming a liability for shared mobility services. The large number of handovers between drivers demands an accurate and fast inspection system, which locates small damages and classifies these into the correct damage category. To address this, a damage detection model is developed to locate vehicle damages and...
Article
One of the reasons that the deployment of network intrusion detection methods falls short is the lack of realistic labeled datasets, which makes it challenging to develop and compare techniques. It is caused by the large amounts of effort that it takes for a cyber expert to classify network connections. This has raised the need for methods that lea...
Article
Full-text available
StyleGAN2 is able to generate very realistic and high-quality faces of humans using a training set (FFHQ). Instead of using one of the many commonly used metrics to evaluate the performance of a face generator (e.g., FID, IS and P&R), this paper uses a more humanlike approach providing a different outlook on the performance of StyleGAN2. The genera...
Preprint
Full-text available
The RangL project hosted by The Alan Turing Institute aims to encourage the wider uptake of reinforcement learning by supporting competitions relating to real-world dynamic decision problems. This article describes the reusable code repository developed by the RangL team and deployed for the 2022 Pathways to Net Zero Challenge, supported by the UK...
Preprint
Full-text available
Background Each year, many help seekers in need contact health helplines for mental support. For this, it is crucial that they receive support immediately, and that waiting times are minimal. In order to minimize delay, it is necessary that helplines have adequate staffing levels, especially during peak hours. This has raised the need for means to...
Preprint
Full-text available
The RangL project hosted by The Alan Turing Institute aims to encourage the wider uptake of reinforcement learning by supporting competitions relating to real-world dynamic decision problems. This article describes the reusable code repository developed by the RangL team and deployed for the 2022 Pathways to Net Zero Challenge, supported by the UK...
Article
Full-text available
Single-view computer vision models for vehicle damage inspection often suffer from strong light reflections. To resolve this, multiple images under various viewpoints can be used. However, multiple views increase the complexity as multi-view training data, specialized models, and damage re-identification over different views are required. In additi...
Article
Objectives The long waiting times for nursing homes can be reduced by applying advanced waiting-line management. In this article, we implement a preference-based allocation model for older adults to nursing homes, evaluate the performance in a simulation setting for 2 case studies, and discuss the implementation in practice. Design Simulation stud...
Preprint
Full-text available
Novel prediction methods should always be compared to a baseline to know how well they perform. Without this frame of reference, the performance score of a model is basically meaningless. What does it mean when a model achieves an $F_1$ of 0.8 on a test set? A proper baseline is needed to evaluate the ‘goodness’ of a performance score. Comparing wi...
Preprint
Full-text available
Measuring and quantifying dependencies between random variables (RVs) can give critical insights into a dataset. Typical questions are: ‘Do underlying relationships exist?’, ‘Are some variables redundant?’, and ‘Is some target variable $Y$ highly or weakly dependent on variable $X$?’ Interestingly, despite the evident need for a general-purpose m...
Article
Full-text available
Background Preventative measures to combat the spread of COVID-19 have introduced social isolation, loneliness and financial stress. This study aims to identify whether the COVID-19 pandemic is related to changes in suicide-related problems for help seekers on a suicide prevention helpline. Methods A retrospective cohort study was conducted usin...
Article
Full-text available
Individuals with autism increasingly enroll in universities, but researchers know little about how their study progresses over time towards degree completion. This exploratory population study uses structural equation modeling to examine patterns in study progression and degree completion of bachelor’s students with autism spectrum disorder ( n = 1...
Article
Emergency response fleets often have to simultaneously perform two types of tasks: (1) urgent tasks requiring immediate action, and (2) non-urgent preventive maintenance tasks that can be scheduled upfront. Huizing et al. (2020) proposed the Median Routing Problem (MRP) to optimally schedule agents to a given set of non-urgent ta...
Chapter
This study aims to identify the patterns of behavior which underlie human mobility. More specifically, we compare commuters who drive in a car with those who use the train in the same geographic region of the Netherlands. We try to understand the mode choices of the commuters based on three factors: the cost of the transport mode, the CO\(_2\) emis...
Article
Full-text available
The Covid-19 pandemic has brought forth a major landscape shock in the mobility sector. Due to its recentness, researchers have just started studying and understanding the implications of this crisis on mobility. We contribute by combining mobility data from various sources to bring a novel angle to understanding mobility patterns during Covid-19....
Article
Full-text available
Objectives We investigate the spatio-temporal variation of monthly residential burglary frequencies across neighborhoods as a function of crime generators, street network features and temporally and spatially lagged burglary frequencies. In addition, we evaluate the performance of the model as a forecasting tool. Methods We analyze 48 months of poli...
Preprint
Full-text available
This paper provides a review of the job recommender system (JRS) literature published in the past decade (2011-2021). Compared to previous literature reviews, we put more emphasis on contributions that incorporate the temporal and reciprocal nature of job recommendations. Previous studies on JRS suggest that taking such views into account in the de...
Preprint
Full-text available
Given the vital importance of search engines to find digital information, there has been much scientific attention on how users interact with search engines, and how such behavior can be modeled. Many models on user - search engine interaction, which in the literature are known as click models, come in the form of Dynamic Bayesian Networks. Althoug...
Article
Full-text available
Background Suicide is a complex issue. Due to the relative rarity of the event, studies into risk factors are regularly limited by sample size or biased samples. The aims of the study were to find risk factors for suicide that are robust to intercorrelation, and which were based on a large and unbiased sample. Methods Using a training set of 5854...
Article
Full-text available
The size of container ships and the number of containers being transshipped at container terminals have steadily increased over the years. Consequently, it is important to make efficient use of the hinterland capacity. A concept that is used to do this is synchromodal transportation, in which at the very last moment the mode of transportation for a...
Preprint
Full-text available
Over the past decade, the advent of cybercrime has accelerated the research on cybersecurity. However, the deployment of intrusion detection methods falls short. One of the reasons for this is the lack of realistic evaluation datasets, which makes it a challenge to develop techniques and compare them. This is caused by the large amounts of effort i...
Article
Full-text available
Background The identification of risk factors for adverse outcomes and prolonged intensive care unit (ICU) stay in COVID-19 patients is essential for prognostication, determining treatment intensity, and resource allocation. Previous studies have determined risk factors on admission only, and included a limited number of predictors. Therefore, usin...
Article
In this study, the classification of white cabbage seedling images is modeled with convolutional neural networks. We focus on a dataset that tracks the seedling growth over a period of 14 days, where photos were taken at four specific moments. The dataset contains 13,200 individual seedlings with corresponding labels and was retrieved from Bejo, a...
Conference Paper
Full-text available
The “almost-squares in almost-squares” (Asqas) problem is a rectangle packing problem in which a series of almost-squares (rectangles of dimensions n × (n + 1)) needs to be placed inside an almost-square frame without open areas or overlaps. Asqas-34, consisting of almost-squares 1 × 2, 2 × 3, ..., 34 × 35, remains unsolved. This paper shows Asqas-34 is...
Preprint
Full-text available
We propose a new model to assess the mastery level of a given skill efficiently. The model, called Bayesian Adaptive Mastery Assessment (BAMA), uses information on the accuracy and the response time of the answers given and infers the mastery at every step of the assessment. BAMA balances the length of the assessment and the certainty of the master...
Preprint
Full-text available
We introduce Invertible Dense Networks (i-DenseNets), a more parameter efficient alternative to Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, where we enforce invertibility of the network by satisfying the Lipschitz constant. We extend this method by proposing a learnable concatenati...
Article
Full-text available
Urban planning can benefit tremendously from a better understanding of where, when, why, and how people travel. Through advances in technology, detailed data on the travel behavior of individuals has become available. This data can be leveraged to understand why one prefers one mode of transportation over another one. In this paper, we analyze a un...
Preprint
Full-text available
We introduce Invertible Dense Networks (i-DenseNets), a more parameter efficient alternative to Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, where we enforce the invertibility of the network by satisfying the Lipschitz constraint. Additionally, we extend this method by proposing a l...
Article
To pick up a container from a container terminal, other containers may need to be relocated to other positions. In practice, these relocation moves are usually done when it is busy at a terminal. However, if the crane is idle for some amount of time, it may be more efficient to execute some pre-processing moves to reduce the number of future reloca...
Article
Full-text available
Background: Autistic individuals' enrollment in universities is increasing, but we know little about their study progress over time. Many of them have poor degree completion in comparison to students with other disabilities. However, longitudinal studies on study progression over time of autistic students (AS) in comparison to their peers are absen...
Article
Full-text available
From 2000 to 2016, the highest number of suicides among Dutch youths under 20 in any given year was 58, in 2013. In 2017 this number increased to 81 youth suicides. To get more insight into what types of youths died by suicide, particularly in recent years (2013–2017), we looked at micro-data of Statistics Netherlands and counted suicides among youths til...
Article
This paper studies a setting in emergency logistics where emergency responders must also perform a set of known, non-emergency jobs in the network when there are no active emergencies going on. These jobs typically have a preventive function, and allow the responders to use their idle time much more productively than in the current standard. When a...
Article
This paper introduces a general model of a single-lane roundabout, represented as a circular lattice that consists of L cells, with Markovian traffic dynamics. Vehicles enter the roundabout via on-ramp queues that have stochastic arrival processes, remain on the roundabout a random number of cells, and depart via off-ramps. Importantly, the model d...
Article
Full-text available
Urban planning can benefit tremendously from a better understanding of where, when, why, and how people travel. Through advances in technology, detailed data on the travel behavior of individuals has become available. This data can be leveraged to understand why one prefers one mode of transportation over another one. In this paper, we analyze a un...
Article
In container terminals, containers are often moved to other stacks in order to access containers that need to leave the terminal earlier. We propose a new optimization model in which the containers can be moved in two different phases: a pre-processing and a relocation phase. To solve this problem, we develop an optimal branch-and-bound algorithm....
Article
Full-text available
The number of students with Autism Spectrum Disorder (ASD) entering Universities is growing. Recent studies show an increased understanding of students with ASD in higher education. Yet, current research generally relies on small samples, lacks information about student characteristics prior to enrollment, and does not compare students with ASD to...
Article
Full-text available
Large-scale data about learners' behavior are being generated at high speed on various online learning platforms. Knowledge Tracing (KT) is a family of machine learning sequence models that use these data to identify the likelihood of future learning performance. KT models hold great potential for the online education industry by enabling the devel...
Chapter
Online fraud poses a relatively new threat to the revenues of companies. A way to detect and prevent fraudulent behavior is with the use of specific machine learning (ML) techniques. These anomaly detection techniques have been thoroughly studied, but the level of employment is not as high. The airline industry suffers from fraud by parties such as...
Preprint
This paper introduces a general model of a single-lane roundabout, represented as a circular lattice that consists of $L$ cells, with Markovian traffic dynamics. Vehicles enter the roundabout via on-ramp queues that have stochastic arrival processes, remain on the roundabout a random number of cells, and depart via off-ramps. Importantly, the model...
Article
Full-text available
In hinterland container transportation the use of barges is getting more and more important. We propose a real‐life operational planning problem model from an inland terminal operating company, in which the number of containers shipped per barge is maximized and the number of terminals visited per barge is minimized. This problem is solved with an...
Conference Paper
Full-text available
Large-scale data about learners' behavior are being generated at high speed on various online learning platforms. Knowledge Tracing (KT) is a family of machine learning sequence models that are capable of using these data efficiently with the objective to identify the likelihood of future learning performance. This study provides an overview of KT...
Article
Full-text available
We consider a K-competing queues system with the additional feature of customer abandonment. Without abandonment, it is optimal to allocate the server to a queue according to the \(c \mu \)-rule. To derive a similar rule for the system with abandonment, we model the system as a continuous-time Markov decision process. Due to impatience, the Markov...
Article
Full-text available
In this paper, we develop a novel role for the initial function $v_0$ in the Value Iteration algorithm. In case the optimal policy of a countable state Markovian queueing control problem has a threshold or switching curve structure, we conjecture that one can tune the choice of $v_0$ to generate monotonic sequences of $n$-stage threshold or switch...
Article
Various systems across a broad range of applications contain tandem queues. Strong dependence between the servers has proven to make such networks complicated and difficult to study. Exact analysis is rarely computationally tractable and sometimes not even possible. Nevertheless, as it is most often the case in reality, there are costs associated w...
Article
Full-text available
In this paper, the concept of an Enterprise Information Portal (EIP) as an end-user interface of the Simulation and Modeling System for Business (SMS-B) is presented. The system is proposed as a Business Intelligence education platform. EIP portals are also a base for Enterprise Integration Platform (EIP II) introduction in information and communicati...
Article
In life-threatening emergency situations in which every second counts, the timely arrival of an ambulance can make the difference between survival and death. In practice, the response-time targets, defined as the maximum time between the moment an incoming emergency call is received and the moment when onsite medical aid is provided, are often not met....
Article
This paper develops and tests a framework for the data-driven scheduling of outbound calls made by debt collectors. These phone calls are used to persuade debtors to settle their debt, or to negotiate payment arrangements in case debtors are willing, but unable to repay. We determine on a daily basis which debtors should be called to maximize the a...
Article
Full-text available
We address the problem of ambulance dispatching, in which we must decide which ambulance to send to an incident in real time. In practice, it is commonly believed that the ‘closest idle ambulance’ rule is near-optimal and it is used throughout most literature. In this paper, we present alternatives to the classical closest idle ambulance rule. Most...
Article
Full-text available
Providers of Emergency Medical Services (EMS) are typically concerned with keeping response times short. A powerful means to ensure this is to dynamically redistribute the ambulances over the region, depending on the current state of the system. In this paper, we provide new insight into how to optimally (re)distribute ambulances. We study the imp...
Conference Paper
Full-text available
We investigate the connectivity within different incremental sets of Japanese Kanji characters. Individual characters constitute the vertices in the network, components shared between them provide their edges. We find the resulting networks to have a high clustering coefficient and a low average path length, characterizing them as small worlds. We...
Article
Various types of systems across a broad range of disciplines contain tandem queues with nested sessions. Strong dependence between the servers has proved to make such networks complicated and difficult to study. Exact analysis is in most of the cases intractable. Moreover, even when performance metrics such as the saturation throughput and the util...
