Search Results for author: Yuhsun Huang

Found 1 papers, 1 papers with code

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

1 code implementation19 Feb 2024 Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen

This paper introduces Sequoia, a scalable, robust, and hardware-aware algorithm for speculative decoding.

Cannot find the paper you are looking for? You can Submit a new open access paper.