Search Results for author: Jayden Ooi

Found 6 papers, 1 paper with code

Towards Content Provider Aware Recommender Systems: A Simulation Study on the Interplay between User and Provider Utilities

no code implementations • 6 May 2021 • Ruohan Zhan, Konstantina Christakopoulou, Ya Le, Jayden Ooi, Martin Mladenov, Alex Beutel, Craig Boutilier, Ed H. Chi, Minmin Chen

We then build a REINFORCE recommender agent, coined EcoAgent, to optimize a joint objective of user utility and the counterfactual utility lift of the provider associated with the recommended content, which we show to be equivalent to maximizing overall user utility and the utilities of all providers on the platform under some mild assumptions.

counterfactual Recommendation Systems

Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition

no code implementations • 15 Aug 2020 • Henry Tsai, Jayden Ooi, Chun-Sung Ferng, Hyung Won Chung, Jason Riesa

Transformer-based models have achieved state-of-the-art results in many tasks in natural language processing.

Neural Architecture Search

ConQUR: Mitigating Delusional Bias in Deep Q-learning

1 code implementation • ICML 2020 • Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier

Delusional bias is a fundamental source of error in approximate Q-learning.

Atari Games Q-Learning

Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing

no code implementations • 12 Feb 2020 • Ge Liu, Rui Wu, Heng-Tze Cheng, Jing Wang, Jayden Ooi, Lihong Li, Ang Li, Wai Lok Sibon Li, Craig Boutilier, Ed Chi

Deep Reinforcement Learning (RL) is proven powerful for decision making in simulated environments.

Atari Games Decision Making

BRPO: Batch Residual Policy Optimization

no code implementations • 8 Feb 2020 • Sungryull Sohn, Yin-Lam Chow, Jayden Ooi, Ofir Nachum, Honglak Lee, Ed Chi, Craig Boutilier

In batch reinforcement learning (RL), one often constrains a learned policy to be close to the behavior (data-generating) policy, e.g., by constraining the learned action distribution to differ from the behavior policy by some maximum degree that is the same at each state.

reinforcement-learning Reinforcement Learning (RL)

Advantage Amplification in Slowly Evolving Latent-State Environments

no code implementations • 29 May 2019 • Martin Mladenov, Ofer Meshi, Jayden Ooi, Dale Schuurmans, Craig Boutilier

Latent-state environments with long horizons, such as those faced by recommender systems, pose significant challenges for reinforcement learning (RL).

Recommendation Systems reinforcement-learning
