Learning curves of the average return and average episode length for all five algorithms in the ICU-Sepsis environment. Table 3 reports the average number of episodes and time steps each algorithm needs to converge.

Source publication
Preprint
We present ICU-Sepsis, an environment that can be used in benchmarks for evaluating reinforcement learning (RL) algorithms. Sepsis management is a complex task that has been an important topic in applied RL research in recent years. Therefore, MDPs that model sepsis management can serve as part of a benchmark to evaluate RL algorithms on a challeng...