10 Feb 2024 • Keisuke Kamahori, Yile Gu, Kan Zhu, Baris Kasikci
Large Language Models (LLMs) based on the Mixture-of-Experts (MoE) architecture show promising performance on various tasks.