1 code implementation • 12 Mar 2024 • Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chen-Xiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu
Specifically, the objective of adversarial data augmentation is not merely to generate data analogous to offline data distribution; instead, it aims to create adversarial examples designed to confound learned task representations and lead to incorrect task identification.
no code implementations • 17 Feb 2024 • Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu
DORA incorporates an information bottleneck principle that maximizes mutual information between the dynamics encoding and the environmental data, while minimizing mutual information between the dynamics encoding and the actions of the behavior policy.
no code implementations • 1 Nov 2023 • Cong Guan, Lichao Zhang, Chunpeng Fan, Yichen Li, Feng Chen, Lihe Li, Yunjia Tian, Lei Yuan, Yang Yu
Developing intelligent agents capable of seamless coordination with humans is a critical step towards achieving artificial general intelligence.
1 code implementation • 10 May 2023 • Lei Yuan, Zi-Qian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Li-He Li, Chao Qian, Yang Yu
Concretely, to avoid the ego-system overfitting to a specific attacker, we maintain a set of attackers, which is optimized to guarantee that the attackers have high attacking quality and behavioral diversity.
no code implementations • 9 May 2023 • Lei Yuan, Feng Chen, Zhongzhang Zhang, Yang Yu
Specifically, we introduce a novel message-attacking approach that models the learning of the auxiliary attacker as a cooperative problem under a shared goal of minimizing the coordination ability of the ego system, in which every information channel may suffer from distinct message attacks.
no code implementations • 7 May 2023 • Lei Yuan, Lihe Li, Ziqian Zhang, Fuxiang Zhang, Cong Guan, Yang Yu
To tackle this issue, this paper proposes Multi-Agent Continual Coordination via Progressive Task Contextualization, dubbed MACPro.
no code implementations • 7 May 2023 • Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu
Many multi-agent scenarios require message sharing among agents to promote coordination, which calls for robust multi-agent communication when policies are deployed in environments with message perturbations.
no code implementations • 19 Feb 2023 • Cong Guan, Feng Chen, Lei Yuan, Zongzhang Zhang, Yang Yu
We also release the offline benchmarks built in this paper as a testbed for validating communication ability, to facilitate future research.
1 code implementation • 5 Jan 2023 • Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu, De-Chuan Zhan
In cooperative multi-agent reinforcement learning (CMARL), it is critical for agents to achieve a balance between self-exploration and team collaboration.
1 code implementation • 13 Oct 2022 • Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu
MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it by a cooperative multi-agent RL (MARL) algorithm.
no code implementations • 9 Aug 2022 • Ke Xue, Yutong Wang, Cong Guan, Lei Yuan, Haobo Fu, Qiang Fu, Chao Qian, Yang Yu
Generating agents that can achieve zero-shot coordination (ZSC) with unseen partners is a new challenge in cooperative multi-agent reinforcement learning (MARL).
1 code implementation • 1 Jun 2022 • Yi Guo, Zhaocheng Liu, Jianchao Tan, Chao Liao, Sen Yang, Lei Yuan, Dongying Kong, Zhi Chen, Ji Liu
When training is finished, some gates are exactly zero while others are around one, which is particularly favored by the practical hot-start training used in industry, because removing the features corresponding to exact-zero gates does not damage model performance.
no code implementations • 1 Jun 2022 • Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu
Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.
1 code implementation • ACL 2022 • Renyu Zhu, Lei Yuan, Xiang Li, Ming Gao, Wenyuan Cai
In this paper, we consider human behaviors and propose the PGNN-EK model that consists of two main components.
no code implementations • 9 Mar 2022 • Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Zongzhang Zhang, Chongjie Zhang, Yang Yu
We demonstrate that the task representation can capture the relationship among tasks, and can generalize to unseen tasks.
1 code implementation • 10 Nov 2021 • Xiangru Lian, Binhang Yuan, XueFeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu
Specifically, in order to ensure both the training efficiency and the training accuracy, we design a novel hybrid training algorithm, where the embedding layer and the dense neural network are handled by different synchronization mechanisms; then we build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm.
no code implementations • 26 Sep 2021 • Jiahan Cao, Lei Yuan, Jianhao Wang, Shaowei Zhang, Chongjie Zhang, Yang Yu, De-Chuan Zhan
Through long-term observation, agents can build awareness of their teammates to alleviate the problem of partial observability.
no code implementations • 20 Aug 2021 • Weicong Ding, Hanlin Tang, Jingshuo Feng, Lei Yuan, Sen Yang, Guangxu Yang, Jie Zheng, Jing Wang, Qiang Su, Dong Zheng, Xuezhong Qiu, Yongqi Liu, Yuxuan Chen, Yang Liu, Chao Song, Dongying Kong, Kai Ren, Peng Jiang, Qiao Lian, Ji Liu
In this setting with multiple and constrained goals, this paper discovers that a probabilistic strategic parameter regime can achieve better value compared to the standard regime of finding a single deterministic parameter.
3 code implementations • ICLR 2021 • Daochen Zha, Wenye Ma, Lei Yuan, Xia Hu, Ji Liu
Unfortunately, methods based on intrinsic rewards often fall short in procedurally-generated environments, where a different environment is generated in each episode so that the agent is not likely to visit the same state more than once.
no code implementations • 23 Oct 2020 • Yunjie Zhang, Fei Tao, Xudong Liu, Runze Su, Xiaorong Mei, Weicong Ding, Zhichen Zhao, Lei Yuan, Ji Liu
In this paper, we propose a novel end-to-end self-organizing framework for user behavior prediction.
no code implementations • 14 Sep 2020 • Runze Su, Fei Tao, Xudong Liu, Hao-Ran Wei, Xiaorong Mei, Zhiyao Duan, Lei Yuan, Ji Liu, Yuying Xie
Applications of short-term user-generated video (UGV), such as Snapchat and YouTube short-term videos, have boomed recently, raising many multimodal machine learning tasks.
no code implementations • 17 Jul 2019 • Hanlin Tang, Xiangru Lian, Shuang Qiu, Lei Yuan, Ce Zhang, Tong Zhang, Ji Liu
Since decentralized training has been shown to be superior to traditional centralized training in communication-restricted scenarios, a natural question to ask is how to apply error-compensated technology to decentralized learning to further reduce the communication cost.
no code implementations • 30 Apr 2013 • Ji Liu, Lei Yuan, Jieping Ye
Specifically, we show 1) in the noiseless case, if the condition number of $D$ is bounded and the measurement number $n\geq \Omega(s\log(p))$ where $s$ is the sparsity number, then the true solution can be recovered with high probability; and 2) in the noisy case, if the condition number of $D$ is bounded and the measurement number increases faster than $s\log(p)$, that is, $s\log(p)=o(n)$, the estimation error converges to zero with probability 1 when $p$ and $s$ go to infinity.
no code implementations • NeurIPS 2011 • Lei Yuan, Jun Liu, Jieping Ye
There have been several recent attempts to study a more general formulation, where groups of features are given, potentially with overlaps between the groups.