Fig 4 - uploaded by Luchen Li
Distributions of returns vs. action deviations. (Left) Distributions of returns for different levels of average absolute vasopressor deviation, per time step, between the clinicians and the proposed policy. The uppermost subplot shows empirical outcomes for patients whose actually received vasopressor doses deviated, per time step, by less than 1/3 of the overall vasopressor deviations (sorted ascending) in the test set, and the lowermost subplot for those above 2/3. (Right) The intravenous fluid counterparts.

Source publication
Preprint
Off-policy reinforcement learning enables learning a near-optimal policy from suboptimal experience, and thereby opens up opportunities for artificial intelligence applications in healthcare. Previous works have mainly framed patient-clinician interactions as Markov decision processes, while true physiological states are not necessarily fully observable from clin...

Context in source publication

Context 1
... of using OPE to provide theoretical policy evaluation, we focus on empirically evaluating our learned policy by comparing how the similarity between clinicians' decisions and our suggestions relates to patient outcomes: this provides an empirical validation and is commonly adopted [8,11] in medical scenarios involving retrospective datasets. Fig. 4 shows probability mass functions (histograms) of the returns of start states (i.e. $\gamma^{T-1} r_{T-1}$, $T$ being the length of that time series) in the test set, divided into three mutually exclusive groups according to the average (per time step) absolute deviation between the clinicians' decision and the proposed dose in terms of vasopressor or ...
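
The grouping described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the "1/3" and "2/3" thresholds refer to rank-based terciles of the per-patient deviations sorted in ascending order, and that reward is given only at the final time step, so the start-state return reduces to $\gamma^{T-1} r_{T-1}$. All function names and the default value of `gamma` are hypothetical.

```python
import numpy as np


def start_state_return(terminal_reward, length, gamma=0.99):
    """Return of the start state, gamma^(T-1) * r_(T-1).

    Assumes reward is nonzero only at the last step; gamma=0.99 is a placeholder.
    """
    return gamma ** (length - 1) * terminal_reward


def mean_abs_deviation(clinician_doses, policy_doses):
    """Average absolute dose deviation per time step for one patient."""
    c = np.asarray(clinician_doses, dtype=float)
    p = np.asarray(policy_doses, dtype=float)
    return float(np.mean(np.abs(c - p)))


def split_into_terciles(deviations):
    """Label each patient 0 (lowest third), 1, or 2 (highest third) by ascending rank."""
    deviations = np.asarray(deviations, dtype=float)
    n = len(deviations)
    order = np.argsort(deviations, kind="stable")
    labels = np.empty(n, dtype=int)
    # The patient with 0-based rank r is placed in group floor(3 * r / n).
    labels[order] = (3 * np.arange(n)) // n
    return labels


def return_pmfs(returns, labels, bins):
    """Normalized histogram (probability mass function) of returns for each group."""
    returns = np.asarray(returns, dtype=float)
    pmfs = {}
    for group in range(3):
        counts, _ = np.histogram(returns[labels == group], bins=bins)
        total = counts.sum()
        pmfs[group] = counts / total if total > 0 else counts.astype(float)
    return pmfs
```

Running this once with vasopressor deviations and once with intravenous fluid deviations would give the two columns of subplots; if outcomes are better where clinicians stayed close to the policy, the lowest-deviation group should place more mass on high returns.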