Search Results for author: Luchen Li

Found 5 papers, 0 papers with code

Bag of Policies for Distributional Deep Exploration

no code implementations • 3 Aug 2023 • Asen Nachkov, Luchen Li, Giulia Luise, Filippo Valdettaro, Aldo Faisal

To test whether optimistic ensemble methods can improve distributional RL as they did scalar RL (e.g., with Bootstrapped DQN), we implement the BoP approach with a population of distributional actor-critics using Bayesian Distributional Policy Gradients (BDPG).

Atari Games, Efficient Exploration, +2

Bayesian Distributional Policy Gradients

no code implementations • 20 Mar 2021 • Luchen Li, A. Aldo Faisal

Distributional Reinforcement Learning (RL) maintains the entire probability distribution of the reward-to-go, i.e., the return, providing more learning signals that account for the uncertainty associated with policy performance, which may be beneficial for trading off exploration and exploitation and for policy learning in general.

Atari Games, Contrastive Learning, +2

Optimizing Medical Treatment for Sepsis in Intensive Care: from Reinforcement Learning to Pre-Trial Evaluation

no code implementations • 13 Mar 2020 • Luchen Li, Ignacio Albert-Smet, Aldo A. Faisal

Our aim is to establish a framework in which retrospectively applying reinforcement learning (RL) to optimize interventions provides a regulatory-compliant pathway to prospective clinical testing of the learned policies in clinical deployment.

Reinforcement Learning (RL)

Optimizing Sequential Medical Treatments with Auto-Encoding Heuristic Search in POMDPs

no code implementations • 17 May 2019 • Luchen Li, Matthieu Komorowski, Aldo A. Faisal

Health-related data are noisy, stochastic indicators of patients' true physiological states, which limits the information that single-moment observations carry for sequential clinical decision making.

Decision Making

The Actor Search Tree Critic (ASTC) for Off-Policy POMDP Learning in Medical Decision Making

no code implementations • 29 May 2018 • Luchen Li, Matthieu Komorowski, Aldo A. Faisal

We capture this situation with a partially observable Markov decision process (POMDP), in which an agent optimises its actions under a belief represented as a distribution over patient states inferred from individual history trajectories.

Decision Making
