no code implementations • 7 Aug 2023 • Yufan Jiang, Qiaozhi He, Xiaomin Zhuang, Zhihua Wu, Kunpeng Wang, Wenlai Zhao, Guangwen Yang
Existing large language models have to run K times to generate a sequence of K tokens.
1 code implementation • 24 May 2023 • Yushu Chen, Shengzhuo Liu, Jinzhe Yang, Hao Jing, Wenlai Zhao, Guangwen Yang
In order to enhance the performance of Transformer models for long-term multivariate forecasting while minimizing computational demands, this paper introduces the Joint Time-Frequency Domain Transformer (JTFT).
1 code implementation • 4 May 2019 • Yushu Chen, Hao Jing, Wenlai Zhao, Zhi-Qiang Liu, Ouyi Li, Liang Qiao, Wei Xue, Guangwen Yang
RSG is further combined with adaptive methods to construct ARSG for acceleration.
no code implementations • 16 Mar 2019 • Jiarui Fang, Liandeng Li, Haohuan Fu, Jinlei Jiang, Wenlai Zhao, Conghui He, Xin You, Guangwen Yang
Second, we propose a set of optimization strategies for redesigning a variety of neural network layers based on Caffe.