Figure 2 - available via license: Creative Commons Attribution 4.0 International
Figure 2 shows the learning curves (average return and average episode length) for all five algorithms in the ICU-Sepsis environment. Table 3 reports the average number of episodes and time steps each algorithm needed to converge.
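As a rough illustration of how learning curves and a convergence count of this kind can be computed, the sketch below averages per-episode returns over independent runs, smooths the result with a moving average, and finds the episode after which the curve stays close to its final value. The function names, the `window` and `tolerance` parameters, and the convergence rule itself are assumptions made for this example; the source does not state how the authors defined convergence.

```python
import numpy as np


def learning_curve(returns_per_run, window=100):
    """Average per-episode returns across runs, then smooth with a moving average.

    returns_per_run: array-like of shape (n_runs, n_episodes).
    window: smoothing width in episodes (hypothetical default).
    """
    returns = np.asarray(returns_per_run, dtype=float)
    if returns.ndim != 2:
        raise ValueError("expected shape (n_runs, n_episodes)")
    if window < 1 or window > returns.shape[1]:
        raise ValueError("window must be between 1 and n_episodes")
    mean_over_runs = returns.mean(axis=0)          # average across runs
    kernel = np.ones(window) / window              # uniform moving-average kernel
    return np.convolve(mean_over_runs, kernel, mode="valid")


def episodes_to_converge(curve, tolerance=0.01):
    """Assumed rule: first index from which the curve stays within
    `tolerance` of its final value. Not the criterion used in the paper."""
    curve = np.asarray(curve, dtype=float)
    outside = np.abs(curve - curve[-1]) > tolerance
    if not outside.any():
        return 0
    # One past the last point that is still outside the tolerance band.
    return int(np.nonzero(outside)[0][-1]) + 1


# Toy data: two runs of four episodes each.
toy_returns = [[0, 0, 1, 1],
               [0, 1, 1, 1]]
curve = learning_curve(toy_returns, window=2)
converged_at = episodes_to_converge(curve, tolerance=0.01)
```

The same averaging applies to episode lengths: pass per-episode lengths instead of returns to obtain the second curve.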
Source publication
We present ICU-Sepsis, an environment that can be used in benchmarks for evaluating reinforcement learning (RL) algorithms. Sepsis management is a complex task that has been an important topic in applied RL research in recent years. Therefore, MDPs that model sepsis management can serve as part of a benchmark to evaluate RL algorithms on a challeng...