Search Results for author: Guoxi Zhang

Found 6 papers, 3 papers with code

INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations

1 code implementation • 19 Mar 2024 • Lirui Luo, Guoxi Zhang, Hongming Xu, Yaodong Yang, Cong Fang, Qing Li

Neuro-symbolic reinforcement learning (NS-RL) has emerged as a promising paradigm for explainable decision-making, characterized by the interpretability of symbolic policies.

Decision Making

Online Policy Learning from Offline Preferences

no code implementations • 15 Mar 2024 • Guoxi Zhang, Han Bao, Hisashi Kashima

To address this problem, the present study introduces a framework for preference-based reinforcement learning (PbRL) that consolidates offline preferences and virtual preferences, the latter being comparisons between the agent's behaviors and the offline data.

Continuous Control

Estimating Treatment Effects Under Heterogeneous Interference

1 code implementation • 25 Sep 2023 • Xiaofeng Lin, Guoxi Zhang, Xiaotian Lu, Han Bao, Koh Takeuchi, Hisashi Kashima

One popular application of this estimation lies in the prediction of the impact of a treatment (e.g., a promotion) on an outcome (e.g., sales) of a particular unit (e.g., an item), known as the individual treatment effect (ITE).

Decision Making

On Modeling Long-Term User Engagement from Stochastic Feedback

no code implementations • 13 Feb 2023 • Guoxi Zhang, Xing Yao, Xuanji Xiao

The ultimate goal of recommender systems (RS) is to improve user engagement.

Reinforcement Learning (RL), Sequential Recommendation

Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning

1 code implementation • 29 Nov 2022 • Guoxi Zhang, Hisashi Kashima

To overcome this drawback, the present study proposes a latent variable model that infers a set of policies from data, allowing an agent to use, as its behavior policy, the policy that best describes a particular trajectory.

Offline RL, reinforcement-learning (+1 more)

Batch Reinforcement Learning from Crowds

no code implementations • 8 Nov 2021 • Guoxi Zhang, Hisashi Kashima

This paper addresses the lack of reward in a batch reinforcement learning setting by learning a reward function from preferences.

reinforcement-learning, Reinforcement Learning (RL)

