no code implementations • 16 May 2024 • Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton
We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical average.
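A minimal sketch of the idea described above, under assumptions of my own (tabular Q-learning, illustrative names like `centered_q_update`; this is not the paper's implementation): maintain a running average of observed rewards and subtract it from each reward inside the discounted TD update.

```python
import numpy as np

def centered_q_update(Q, s, a, r, s_next, r_bar,
                      alpha=0.1, eta=0.1, gamma=0.99):
    """One discounted Q-learning update with the reward centered by
    the running average-reward estimate r_bar (illustrative sketch)."""
    td_error = (r - r_bar) + gamma * np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += alpha * td_error
    # track the empirical average of observed rewards
    r_bar += eta * alpha * (r - r_bar)
    return r_bar
```

Because only the reward is shifted, the greedy policy is unchanged; the centered values simply stay closer to zero, which is where the claimed benefit comes from.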
no code implementations • 1 May 2024 • Yu Cui, Feng Liu, Pengbo Wang, Bohao Wang, Heng Tang, Yi Wan, Jun Wang, Jiawei Chen
Owing to their powerful semantic reasoning capabilities, Large Language Models (LLMs) have been effectively utilized as recommenders, achieving impressive performance.
no code implementations • 25 Apr 2024 • Xiang He, Weiye Song, Yiming Wang, Fabio Poiesi, Ji Yi, Manishi Desai, Quanqing Xu, Kongzheng Yang, Yi Wan
Automatic retinal layer segmentation with medical images, such as optical coherence tomography (OCT) images, serves as an important tool for diagnosing ophthalmic diseases.
no code implementations • 2 Mar 2024 • Wei Xu, Yi Wan
The attention mechanism has gained significant recognition in the field of computer vision due to its ability to effectively enhance the performance of deep neural networks.
no code implementations • 22 Dec 2023 • Huizhen Yu, Yi Wan, Richard S. Sutton
In this paper, we study asynchronous stochastic approximation algorithms without communication delays.
1 code implementation • 6 Dec 2023 • Zheqing Zhu, Rodrigo de Salvo Braz, Jalaj Bhandari, Daniel Jiang, Yi Wan, Yonathan Efroni, Liyuan Wang, Ruiyang Xu, Hongbo Guo, Alex Nikulkov, Dmytro Korenkevych, Urun Dogan, Frank Cheng, Zheng Wu, Wanqiao Xu
Reinforcement Learning (RL) offers a versatile framework for achieving long-term goals.
no code implementations • 14 Aug 2023 • Runyu Jiao, Yi Wan, Fabio Poiesi, Yiming Wang
The increasing popularity of compact and inexpensive cameras, e.g., dash cameras, body cameras, and cameras equipped on robots, has sparked a growing interest in detecting anomalies within dynamic scenes recorded by moving cameras.
1 code implementation • 28 Jul 2023 • Youjie Zhou, Guofeng Mei, Yiming Wang, Fabio Poiesi, Yi Wan
This paper presents an investigation into the estimation of optical and scene flow using RGBD information in scenarios where the RGB modality is affected by noise or captured in dark environments.
no code implementations • 28 Mar 2023 • Yameng Wang, Yi Wan, Yongjun Zhang, Bin Zhang, Zhi Gao
Present multi-modal methods usually map high-dimensional features to low-dimensional spaces as a preprocessing step before feature extraction to address the nonnegligible domain gap, which inevitably leads to information loss.
no code implementations • 30 Sep 2022 • Yi Wan, Richard S. Sutton
We show that two average-reward off-policy control algorithms, Differential Q-learning (Wan, Naik, & Sutton, 2021a) and RVI Q-learning (Abounadi, Bertsekas, & Borkar, 2001), converge in weakly communicating MDPs.
no code implementations • 25 May 2022 • Yi Wan, Richard S. Sutton
In a variant of the classic four-room domain, we show that 1) a higher objective value is typically associated with fewer elementary planning operations used by the option-value iteration algorithm to obtain a near-optimal value function, 2) our algorithm achieves an objective value matching that achieved by two human-designed options, 3) the amount of computation used by option-value iteration with options discovered by our algorithm matches that with the human-designed options, and 4) the options produced by our algorithm also make intuitive sense: they seem to move to and terminate at the entrances of rooms.
1 code implementation • 25 Apr 2022 • Yi Wan, Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Sarath Chandar, Harm van Seijen
We empirically validate these insights in the case of linear function approximation by demonstrating that a modified version of linear Dyna achieves effective adaptation to local changes.
no code implementations • 21 Feb 2022 • Yongjun Zhang, Siyuan Zou, Xinyi Liu, Xu Huang, Yi Wan, Yongxiang Yao
We propose a riverbed enhancement function to optimize the cost volume of the LiDAR projection points and their homogeneous pixels to improve the matching robustness.
no code implementations • CVPR 2022 • Dong Wei, Yi Wan, Yongjun Zhang, Xinyi Liu, Bin Zhang, Xiqi Wang
In this paper, we propose an efficient line segment reconstruction method called ELSR.
1 code implementation • 31 Oct 2021 • Youjie Zhou, Yiming Wang, Fabio Poiesi, Qi Qin, Yi Wan
We compare our L3D-based loop closure approach with recent approaches on LiDAR data and achieve state-of-the-art loop closure detection accuracy.
no code implementations • NeurIPS 2021 • Yi Wan, Abhishek Naik, Richard S. Sutton
We extend the options framework for temporal abstraction in reinforcement learning from discounted Markov decision processes (MDPs) to average-reward MDPs.
no code implementations • 17 Apr 2021 • Katya Kudashkina, Yi Wan, Abhishek Naik, Richard S. Sutton
Our algorithms and experiments are the first to treat MBRL with expectation models in a general setting.
1 code implementation • 8 Jan 2021 • Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson
We consider off-policy policy evaluation with function approximation (FA) in average-reward MDPs, where the goal is to estimate both the reward rate and the differential value function.
no code implementations • 7 Jan 2021 • Yicheng Guo, Yujin Wen, Congwei Jiang, Yixin Lian, Yi Wan
Anomaly detection is a crucial and challenging subject that has been studied within diverse research areas.
no code implementations • 1 Jan 2021 • Kristopher De Asis, Alan Chan, Yi Wan, Richard S. Sutton
Our emphasis is on the first approach in this work, detailing an incremental policy gradient update which neither waits until the end of the episode, nor relies on learning estimates of the return.
1 code implementation • 29 Jun 2020 • Yi Wan, Abhishek Naik, Richard S. Sutton
We introduce learning and planning algorithms for average-reward MDPs, including 1) the first general proven-convergent off-policy model-free control algorithm without reference states, 2) the first proven-convergent off-policy model-free prediction algorithm, and 3) the first off-policy learning algorithm that converges to the actual value function rather than to the value function plus an offset.
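The first of the listed algorithms, Differential Q-learning, can be sketched roughly as follows (a hedged sketch based on the update rule described in Wan, Naik & Sutton 2021, with illustrative names): a single TD error, computed with the average-reward estimate in place of discounting, drives both the value table and the reward-rate estimate, so no reference state is needed.

```python
import numpy as np

def differential_q_step(Q, r_bar, s, a, r, s_next, alpha=0.1, eta=1.0):
    """One tabular Differential Q-learning update (sketch).
    delta uses the average-reward estimate r_bar instead of a
    discount factor; the same delta updates Q and r_bar."""
    delta = r - r_bar + np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += alpha * delta
    r_bar += eta * alpha * delta
    return r_bar
```

On a simple unichain MDP the estimate `r_bar` converges to the true reward rate, while `Q` converges to a differential value function (unique only up to an additive constant).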
no code implementations • 7 Feb 2020 • Zhimin Hou, Kuangen Zhang, Yi Wan, Dongyu Li, Chenglong Fu, Haoyong Yu
A common way to solve this problem, known as Mixture-of-Experts, is to represent the policy as the weighted sum of multiple components, where different components perform well on different parts of the state space.
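A minimal sketch of such a Mixture-of-Experts policy (all names here are illustrative, not from the paper): a softmax gate over state features produces the weights, and the policy's action distribution is the weighted sum of the components' distributions.

```python
import numpy as np

def mixture_policy(x, gate_weights, components):
    """Action distribution of a Mixture-of-Experts policy: a softmax
    gate over state features x mixes the component policies, each of
    which maps features to an action distribution."""
    logits = gate_weights @ x
    g = np.exp(logits - logits.max())  # numerically stable softmax
    g /= g.sum()
    return sum(gi * pi(x) for gi, pi in zip(g, components))
```

Since the gate weights form a convex combination, the mixture is itself a valid probability distribution whenever each component is.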
no code implementations • 2 Apr 2019 • Yi Wan, Zaheer Abbas, Adam White, Martha White, Richard S. Sutton
In particular, we 1) show that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, 2) analyze two common parametrization choices for approximating the expectation: linear and non-linear expectation models, 3) propose a sound model-based policy evaluation algorithm and present its convergence results, and 4) empirically demonstrate the effectiveness of the proposed planning algorithm.
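Point 1) above follows from linearity of expectation: if v(s) = w·s, the value of the expected next state equals the expected next value. A small numerical check of that equivalence (illustrative setup, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=4)                # weights of a linear value function
samples = rng.normal(size=(1000, 4))  # next-state samples (distribution model)

value_of_mean = w @ samples.mean(axis=0)  # expectation model: v(E[s'])
mean_of_values = (samples @ w).mean()     # distribution model: E[v(s')]
assert np.isclose(value_of_mean, mean_of_values)
```

The two quantities agree exactly for any sample set, which is why an expectation model loses nothing relative to a distribution model in the linear case.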
no code implementations • 14 Apr 2015 • Hao Wu, Yi Wan
In computer vision, the estimation of the fundamental matrix is a basic problem that has been extensively studied.