Search Results for author: Keisuke Kamahori

Found 1 paper, 1 paper with code

Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models

1 code implementation • 10 Feb 2024 • Keisuke Kamahori, Yile Gu, Kan Zhu, Baris Kasikci

Large Language Models (LLMs) based on Mixture-of-Experts (MoE) architecture are showing promising performance on various tasks.
