1 code implementation • 8 Apr 2024 • David Valensi, Esther Derman, Shie Mannor, Gal Dalal
We show that, given the observed delay values, it suffices to search within the class of Markov policies to reach optimal performance, thus extending the deterministic fixed-delay case.
no code implementations • 3 Sep 2023 • Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor
In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set.
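As a toy illustration of this setup (our own sketch, not code from the paper), a robust value update can take the worst case over a finite uncertainty set of candidate reward/transition models; all names below are ours:

```python
import numpy as np

def robust_backup(V, models, gamma=0.9):
    """One robust value-iteration step over a finite uncertainty set.

    V      : array of shape (S,), current value estimate
    models : list of (R, P) pairs with R of shape (S, A) and
             P of shape (S, A, S)
    """
    # For each (s, a), take the minimum backed-up value across models,
    # then act greedily; this is the pessimistic (robust) update.
    Q_worst = np.min([R + gamma * (P @ V) for R, P in models], axis=0)
    return Q_worst.max(axis=1)

# Toy example: two candidate models of a 2-state, 2-action MDP.
rng = np.random.default_rng(0)
models = []
for _ in range(2):
    R = rng.uniform(size=(2, 2))
    P = rng.uniform(size=(2, 2, 2))
    P /= P.sum(axis=-1, keepdims=True)  # normalize rows to transition kernels
    models.append((R, P))

V = np.zeros(2)
for _ in range(200):  # gamma-contraction, so this converges
    V = robust_backup(V, models)
```

The iteration converges to the robust value of the toy problem because the worst-case backup is still a gamma-contraction.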
1 code implementation • 12 Mar 2023 • Esther Derman, Yevgeniy Men, Matthieu Geist, Shie Mannor
We then generalize regularized MDPs to twice regularized MDPs ($\text{R}^2$ MDPs), i.e., MDPs with $\textit{both}$ value and policy regularization.
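A rough sketch of the "twice regularized" idea (our own toy construction, not the paper's exact operator): a Bellman backup that carries both a policy regularizer (negative entropy, yielding a soft maximum over actions) and a value regularizer (an $\ell_\infty$ penalty on $V$):

```python
import numpy as np

def r2_backup(V, R, P, gamma=0.9, tau=0.1, kappa=0.05):
    """Illustrative doubly regularized backup.

    V: (S,), R: (S, A), P: (S, A, S); tau is the entropy temperature
    (policy regularization), kappa the value-penalty weight.
    """
    Q = R + gamma * (P @ V)                      # standard backup, (S, A)
    Qmax = Q.max(axis=1, keepdims=True)          # stabilized log-sum-exp
    soft = (Qmax + tau * np.log(np.exp((Q - Qmax) / tau)
                                .sum(axis=1, keepdims=True))).ravel()
    return soft - kappa * np.linalg.norm(V, np.inf)  # value regularization

# Toy 3-state, 2-action model; the map contracts since gamma + kappa < 1.
rng = np.random.default_rng(0)
R = rng.uniform(size=(3, 2))
P = rng.uniform(size=(3, 2, 3))
P /= P.sum(axis=-1, keepdims=True)

V = np.zeros(3)
for _ in range(500):
    V = r2_backup(V, R, P)
```

The point of the sketch is only that both regularizers act inside one fixed-point iteration, which still converges as long as the combined modulus `gamma + kappa` stays below one.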
no code implementations • NeurIPS 2023 • Navdeep Kumar, Esther Derman, Matthieu Geist, Kfir Levy, Shie Mannor
We provide a closed-form expression for the worst-case occupation measure.
no code implementations • NeurIPS 2021 • Esther Derman, Matthieu Geist, Shie Mannor
We finally generalize regularized MDPs to twice regularized MDPs ($\text{R}^2$ MDPs), i.e., MDPs with $\textit{both}$ value and policy regularization.
2 code implementations • ICLR 2021 • Esther Derman, Gal Dalal, Shie Mannor
We introduce a framework for learning and planning in MDPs where the decision-maker commits actions that are executed with a delay of $m$ steps.
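The execution-delay mechanism can be mimicked with a simple queue (a minimal sketch with our own names, not the paper's API): an action committed at time $t$ is applied to the environment only at time $t + m$.

```python
from collections import deque

class DelayedExecutor:
    """Applies each committed action m steps after it is committed."""

    def __init__(self, m, default_action):
        # Pre-fill the queue so the first m steps execute a default action.
        self.queue = deque([default_action] * m)

    def step(self, committed_action):
        """Commit a new action; return the action actually executed now."""
        self.queue.append(committed_action)
        return self.queue.popleft()

# With m = 2, actions 'a0', 'a1', ... are executed two steps late.
ex = DelayedExecutor(m=2, default_action=None)
executed = [ex.step(f"a{t}") for t in range(5)]
# executed == [None, None, 'a0', 'a1', 'a2']
```

The queue makes the information structure explicit: at decision time the agent knows the $m$ pending actions but not their outcomes yet.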
no code implementations • 5 Mar 2020 • Esther Derman, Shie Mannor
Distributionally Robust Optimization (DRO) has made it possible to prove the equivalence between robustness and regularization in classification and regression, thus providing an analytical reason why regularization generalizes well in statistical learning.
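A one-sample numerical illustration of this robustness-regularization duality (our own toy, not taken from the paper): for linear regression, the worst-case absolute loss under an $\ell_2$-bounded input perturbation equals the nominal loss plus a norm penalty on the weights, $\sup_{\|d\|_2 \le \epsilon} |y - w^\top(x+d)| = |y - w^\top x| + \epsilon \|w\|_2$.

```python
import numpy as np

rng = np.random.default_rng(1)
w, x = rng.normal(size=3), rng.normal(size=3)
y, eps = 0.5, 0.1

c = y - w @ x
# The worst-case perturbation pushes w.d against the residual's sign.
d_star = -np.sign(c) * eps * w / np.linalg.norm(w)

robust_loss = abs(y - w @ (x + d_star))            # adversarial loss
regularized_loss = abs(c) + eps * np.linalg.norm(w)  # loss + norm penalty
# The two quantities coincide exactly for this single-sample case.
```

The same mechanism, lifted from supervised losses to Bellman operators, is what connects robust MDPs to regularized MDPs.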
no code implementations • 20 May 2019 • Esther Derman, Daniel Mankowitz, Timothy Mann, Shie Mannor
Robust Markov Decision Processes (RMDPs) aim to ensure robustness against changing or adversarial system behavior.
no code implementations • 11 Mar 2018 • Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor
It learns an optimal policy with respect to a distribution over an uncertainty set, remaining robust to model uncertainty while avoiding the conservativeness of worst-case robust strategies.
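A toy contrast between the two approaches (our own construction, not the paper's algorithm): backing up the value under a weighted average of models in the uncertainty set, instead of under the worst one, yields a value no lower than the fully robust one.

```python
import numpy as np

def backup(V, R, P, gamma=0.9):
    """Greedy Bellman backup for one model. V: (S,), R: (S, A), P: (S, A, S)."""
    return (R + gamma * (P @ V)).max(axis=1)

# Toy uncertainty set: three random 4-state, 2-action models.
rng = np.random.default_rng(2)
models = []
for _ in range(3):
    R = rng.uniform(size=(4, 2))
    P = rng.uniform(size=(4, 2, 4))
    P /= P.sum(axis=-1, keepdims=True)
    models.append((R, P))
weights = np.ones(3) / 3  # distribution over the uncertainty set

V_robust, V_soft = np.zeros(4), np.zeros(4)
for _ in range(300):
    # Worst case over models (conservative) vs. expectation over models.
    V_robust = np.min([backup(V_robust, R, P) for R, P in models], axis=0)
    V_soft = sum(w * backup(V_soft, R, P) for (R, P), w in zip(models, weights))
```

Since the averaged operator dominates the worst-case one pointwise and both are monotone contractions, the soft value fixed point dominates the robust one, which is exactly the reduced conservativeness the snippet describes.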