1 code implementation • 26 Mar 2024 • Zhen Tian, Wayne Xin Zhao, Changwang Zhang, Xin Zhao, Zhongrui Ma, Ji-Rong Wen
The core of the Transformer architecture lies in the self-attention mechanism, which computes pairwise attention scores between all positions in a sequence.
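As a reference for what "pairwise attention scores" means, here is a minimal NumPy sketch of standard scaled dot-product self-attention; the dimensions and names are illustrative assumptions, not this paper's implementation:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Pairwise scores: one value per (query position, key position) pair.
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # (seq_len, d_head)

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 16, 8
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)               # shape (5, 8)
```

The (seq_len, seq_len) score matrix is the quadratic-cost core that attention-efficiency work in this line of research targets.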
no code implementations • 7 Feb 2024 • Xiaohan Yu, Li Zhang, Xin Zhao, Yue Wang, Zhongrui Ma
To address this limitation, we propose a new paradigm, ID representation, which incorporates pre-trained ID embeddings into LLMs in a complementary manner.
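One plausible reading of "incorporates pre-trained ID embeddings into LLMs in a complementary manner" is sketched below; this is an assumption for illustration, not the paper's actual architecture. Frozen pre-trained item-ID vectors are projected into the LLM's hidden size and placed alongside the text-token embeddings:

```python
import torch
import torch.nn as nn

class IDAugmentedInput(nn.Module):
    def __init__(self, id_table: torch.Tensor, llm_embed: nn.Embedding):
        super().__init__()
        # Frozen pre-trained ID embeddings (e.g., from a collaborative model).
        self.id_emb = nn.Embedding.from_pretrained(id_table, freeze=True)
        self.proj = nn.Linear(id_table.size(1), llm_embed.embedding_dim)
        self.llm_embed = llm_embed  # the LLM's own token-embedding table

    def forward(self, item_ids, token_ids):
        id_vecs = self.proj(self.id_emb(item_ids))   # (B, n_items, d_llm)
        tok_vecs = self.llm_embed(token_ids)         # (B, n_tokens, d_llm)
        # Complementary fusion: ID vectors sit alongside the text tokens,
        # so the LLM sees both collaborative and textual signals.
        return torch.cat([id_vecs, tok_vecs], dim=1)

# Hypothetical usage: pass the fused sequence to the LLM via inputs_embeds.
id_table = torch.randn(1000, 64)          # made-up pre-trained ID embeddings
llm_embed = nn.Embedding(32000, 512)      # stand-in for the LLM's table
fuse = IDAugmentedInput(id_table, llm_embed)
embeds = fuse(torch.tensor([[3, 17]]), torch.randint(0, 32000, (1, 10)))
print(embeds.shape)                       # torch.Size([1, 12, 512])
```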