Paper tables with annotated results for Conditional Importance Sampling for Off-Policy Learning

Paper

Conditional Importance Sampling for Off-Policy Learning

The principal contribution of this paper is a conceptual framework for off-policy reinforcement learning, based on conditional expectations of importance sampling ratios. This framework yields new perspectives on, and a better understanding of, existing off-policy algorithms, and reveals a broad space of unexplored algorithms. We theoretically analyse this space, and concretely investigate several algorithms that arise from this framework.
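As a rough illustration of what "conditional expectations of importance sampling ratios" can mean, the sketch below compares an ordinary importance-weighted estimate with one that replaces each ratio by its conditional expectation given the observed reward. It is a minimal toy example, not the paper's algorithm: the one-step problem, the policies `mu` and `pi`, the `reward` table, and all function names are assumptions made up for illustration. Means and variances are computed exactly by enumeration rather than by sampling.

```python
from collections import defaultdict

# Hypothetical one-step problem (all numbers are illustrative assumptions).
mu = [0.5, 0.3, 0.2]      # behaviour policy over three actions
pi = [0.1, 0.6, 0.3]      # target policy over the same actions
reward = [1.0, 1.0, 0.0]  # deterministic reward for each action


def ordinary_ratios():
    """Per-action importance sampling ratio pi(a) / mu(a)."""
    return [p / m for p, m in zip(pi, mu)]


def conditional_ratios():
    """Per-action value of E_mu[ratio | reward].

    Actions sharing a reward share one weight: the target policy's
    probability of that reward divided by the behaviour policy's.
    """
    pi_mass = defaultdict(float)
    mu_mass = defaultdict(float)
    for p, m, r in zip(pi, mu, reward):
        pi_mass[r] += p
        mu_mass[r] += m
    return [pi_mass[r] / mu_mass[r] for r in reward]


def mean_and_variance(weights):
    """Exact mean and variance, under mu, of the estimator weight(a) * reward(a)."""
    mean = sum(m * w * r for m, w, r in zip(mu, weights, reward))
    second_moment = sum(m * (w * r) ** 2 for m, w, r in zip(mu, weights, reward))
    return mean, second_moment - mean ** 2


target_value = sum(p * r for p, r in zip(pi, reward))
is_mean, is_var = mean_and_variance(ordinary_ratios())
cis_mean, cis_var = mean_and_variance(conditional_ratios())

print(f"target value:         {target_value:.4f}")
print(f"ordinary ratios:      mean={is_mean:.4f} var={is_var:.4f}")
print(f"conditional ratios:   mean={cis_mean:.4f} var={cis_var:.4f}")
```

In this toy setting both estimators have the same expectation as the target policy's value, while the conditioned weights give a smaller variance, which is the usual effect of replacing a random variable by a conditional expectation.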
