no code implementations • ICML 2020 • Fangcheng Fu, Yuzheng Hu, Yihan He, Jiawei Jiang, Yingxia Shao, Ce Zhang, Bin Cui
Recent years have witnessed intense research interest in training deep neural networks (DNNs) more efficiently via quantization-based compression methods, which facilitate DNN training in two ways: (1) activations are quantized to shrink memory consumption, and (2) gradients are quantized to reduce communication cost.
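As a concrete illustration of these two mechanisms, below is a minimal sketch of unbiased stochastic uniform quantization in PyTorch; the function name `quantize_stochastic` and the bit-widths are illustrative, not taken from the paper.

```python
import torch

def quantize_stochastic(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Unbiased uniform stochastic quantization: round to a neighboring
    grid point with probability proportional to proximity."""
    levels = 2 ** num_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels + 1e-12  # avoid division by zero
    normalized = (x - lo) / scale       # values now lie in [0, levels]
    floor = normalized.floor()
    quantized = floor + torch.bernoulli(normalized - floor)
    return lo + quantized * scale       # dequantize to the original range

# Activations quantized in the forward pass shrink memory; gradients
# quantized before communication shrink bandwidth.
grad = torch.randn(1024)
grad_q = quantize_stochastic(grad, num_bits=4)
```

Stochastic (rather than nearest) rounding keeps the quantized tensor unbiased in expectation, which is the property such training schemes typically rely on.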
no code implementations • 27 Sep 2023 • Yuhang Liu, Boyi Sun, Yuke Li, Yuzheng Hu, Fei-Yue Wang
It uses a graph-attention Transformer to extract domain-specific features for each agent, coupled with a cross-attention mechanism for the final fusion.
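A minimal PyTorch sketch of what such a cross-attention fusion step could look like, assuming per-agent feature sequences; the module `CrossAttentionFusion` and its dimensions are hypothetical and may differ from the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse one agent's features (queries) with another agent's
    domain-specific features (keys/values) via multi-head attention."""
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, ego_feats, other_feats):
        # ego_feats: (B, N, dim) queries; other_feats: (B, M, dim) keys/values
        fused, _ = self.attn(ego_feats, other_feats, other_feats)
        return self.norm(ego_feats + fused)  # residual connection + norm

fusion = CrossAttentionFusion()
ego, other = torch.randn(2, 16, 256), torch.randn(2, 32, 256)
out = fusion(ego, other)  # (2, 16, 256)
```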
no code implementations • 5 Jul 2023 • Yuzheng Hu, Fan Wu, Qinbin Li, Yunhui Long, Gonzalo Munilla Garrido, Chang Ge, Bolin Ding, David Forsyth, Bo Li, Dawn Song
As the prevalence of data analysis grows, safeguarding data privacy has become a paramount concern.
1 code implementation • 28 Nov 2022 • Yuzheng Hu, Fan Wu, Hongyang Zhang, Han Zhao
More specifically, we demonstrate that while the constraint of adversarial robustness consistently degrades standard accuracy in the class-balanced setting, the class imbalance ratio plays a fundamentally different role in accuracy disparity than it does in the Gaussian case, owing to the heavy tail of the stable distribution.
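To see why the tail matters, here is a small numerical illustration (not the paper's model) of how much heavier the tail of an alpha-stable law with alpha < 2 is than a Gaussian's, using SciPy's `levy_stable`:

```python
import numpy as np
from scipy.stats import levy_stable, norm

rng = np.random.default_rng(0)
alpha = 1.5  # stability parameter < 2 gives power-law tails
stable = levy_stable.rvs(alpha, beta=0.0, size=100_000, random_state=rng)
gauss = norm.rvs(size=100_000, random_state=rng)

# Tail probability P(|X| > 10): effectively zero for a Gaussian,
# but on the order of 1e-2 for the stable law.
print(np.mean(np.abs(stable) > 10))
print(np.mean(np.abs(gauss) > 10))
```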
no code implementations • 19 Jul 2022 • Yuzheng Hu, Tianle Cai, Jinyong Shan, Shange Tang, Chaochao Cai, Ethan Song, Bo Li, Dawn Song
We provide a comprehensive and rigorous privacy analysis of vertical logistic regression (VLR) in a class of open-source federated learning frameworks whose protocols may differ from one another, yet implicitly share a common procedure for obtaining local gradients.
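A plaintext sketch of that implicitly shared procedure, assuming two parties holding disjoint feature blocks; real frameworks wrap this exchange in homomorphic encryption or secret sharing, which the illustration below omits.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# In vertical logistic regression, each party holds a disjoint feature
# block, the joint logit is the sum of partial logits, and each party's
# local gradient is the residual times its own features.
n, dA, dB = 128, 5, 7
rng = np.random.default_rng(0)
xA, xB = rng.normal(size=(n, dA)), rng.normal(size=(n, dB))
y = rng.integers(0, 2, size=n)
wA, wB = np.zeros(dA), np.zeros(dB)

logit = xA @ wA + xB @ wB       # partial logits are summed
residual = sigmoid(logit) - y   # shared intermediate quantity
grad_A = xA.T @ residual / n    # party A's local gradient
grad_B = xB.T @ residual / n    # party B's local gradient
```

The residual vector is the intermediate quantity exchanged (in some protected form) across such frameworks, which is why a unified privacy analysis of the local-gradient step is possible.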
no code implementations • ICLR 2022 • Yuzheng Hu, Ziwei Ji, Matus Telgarsky
We show that the simplest actor-critic method -- a linear softmax policy updated with TD through interaction with a linear MDP, but featuring no explicit regularization or exploration -- does not merely find an optimal policy, but moreover prefers high entropy optimal policies.
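A minimal sketch of such an unregularized linear softmax actor-critic step, for illustration only; the paper's analysis takes place in a linear MDP, which this sketch does not model.

```python
import numpy as np

def softmax_policy(theta, phi_s):          # phi_s: (num_actions, d)
    logits = phi_s @ theta
    p = np.exp(logits - logits.max())
    return p / p.sum()

def td_actor_critic_step(theta, w, phi_s, a, r, phi_next, a_next,
                         gamma=0.99, lr_w=0.1, lr_theta=0.01):
    # Critic: TD(0) update on a linear action-value estimate.
    td_error = r + gamma * phi_next[a_next] @ w - phi_s[a] @ w
    w = w + lr_w * td_error * phi_s[a]
    # Actor: policy-gradient step using the critic's value as the signal.
    # Note: no entropy bonus or explicit regularization anywhere.
    pi = softmax_policy(theta, phi_s)
    grad_log_pi = phi_s[a] - pi @ phi_s
    theta = theta + lr_theta * (phi_s[a] @ w) * grad_log_pi
    return theta, w
```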
1 code implementation • ICLR 2022 • Muthu Chidambaram, Xiang Wang, Yuzheng Hu, Chenwei Wu, Rong Ge
Despite seeing very few true data points during training, models trained with Mixup still appear to minimize the original empirical risk, and exhibit better generalization and robustness on various tasks compared with standard training.
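For reference, vanilla Mixup (Zhang et al., 2018) trains only on convex combinations of random input pairs and their labels, which is why models see so few true data points; a minimal NumPy sketch:

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Mixup: replace each training pair with a convex combination of
    two random examples and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # mixing weight in (0, 1)
    perm = rng.permutation(len(x))          # random pairing of examples
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix
```

With alpha = 1.0 the mixing weight is uniform on (0, 1), so a mixed point almost never coincides with an original sample.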
1 code implementation • 20 Dec 2019 • Yuzheng Hu, Licong Lin, Shange Tang
To the best of our knowledge, this is the first paper to seriously examine the necessity of the square root in adaptive gradient methods.
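One way to probe that question is to expose the second-moment exponent in an Adam-style update, so that p = 0.5 recovers the usual square root while other values can be compared; this is a hypothetical sketch, not the paper's exact algorithm.

```python
import numpy as np

def adaptive_step(w, grad, m, v, t, p=0.5,
                  lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style step with the second-moment exponent p exposed:
    p = 0.5 is standard Adam; p = 1 drops the square root entirely."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    w = w - lr * m_hat / (v_hat ** p + eps)
    return w, m, v
```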