no code implementations • 5 Apr 2024 • Ajay Jaiswal, Bodun Hu, Lu Yin, Yeonju Ro, Shiwei Liu, Tianlong Chen, Aditya Akella
In this work, we observe the saturation of the computationally expensive feed-forward blocks in LLM layers and propose FFN-SkipLLM, a novel fine-grained skip strategy for autoregressive LLMs.
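A minimal sketch of the idea behind skipping saturated FFN blocks is shown below. It assumes a layer interface with `layer.attn` and `layer.ffn` sub-modules, a hypothetical cosine-similarity `threshold`, and a calibration-based skip decision; the paper's actual criterion and skip schedule may differ.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def find_skippable_ffn_layers(layers, calib_hidden, threshold=0.99):
    """Flag FFN blocks whose output is nearly identical to their input on a
    calibration batch; those blocks become candidates for skipping at decode time."""
    skip = []
    h = calib_hidden
    for layer in layers:
        h = h + layer.attn(h)
        ffn_out = h + layer.ffn(h)
        sim = F.cosine_similarity(h.flatten(1), ffn_out.flatten(1), dim=-1).mean()
        skip.append(sim.item() > threshold)   # saturated -> skip this FFN later
        h = ffn_out
    return skip

@torch.no_grad()
def decode_step(layers, hidden, skip):
    """One decoding step that omits the FFN sub-block in layers flagged as saturated."""
    for layer, skip_ffn in zip(layers, skip):
        hidden = hidden + layer.attn(hidden)
        if not skip_ffn:
            hidden = hidden + layer.ffn(hidden)
    return hidden
```

In this sketch the skip mask is computed once on calibration tokens and reused during generation, so the saved FFN compute is realized at decode time.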
no code implementations • CVPR 2022 • Yongkweon Jeon, Chungman Lee, Eulrang Cho, Yeonju Ro
We thus propose Mr. BiQ, a new post-training non-uniform quantization method that enables low-bit-width quantization even for Transformer models.
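Non-uniform quantization of this kind is commonly expressed in a binary-coding form, where a weight tensor is approximated as a sum of scaled sign tensors. The sketch below uses a simple greedy residual fit to illustrate that representation; it is an illustrative baseline, not Mr. BiQ's actual optimization procedure.

```python
import torch

def greedy_binary_coding_quantize(w: torch.Tensor, num_bits: int = 3):
    """Approximate w as sum_i alpha_i * b_i with b_i in {-1, +1}.

    The resulting reconstruction levels are non-uniformly spaced, unlike
    a single-scale uniform quantizer.  Greedy residual fitting is used
    here purely for illustration.
    """
    residual = w.clone()
    alphas, codes = [], []
    for _ in range(num_bits):
        b = torch.sign(residual)
        b[b == 0] = 1.0                      # avoid zero codes
        alpha = (residual * b).mean()        # least-squares optimal scale for this code
        alphas.append(alpha)
        codes.append(b)
        residual = residual - alpha * b
    w_hat = sum(a * b for a, b in zip(alphas, codes))
    return w_hat, alphas, codes
```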
no code implementations • 5 May 2021 • Byeongwook Kim, Dongsoo Lee, Yeonju Ro, Yongkweon Jeon, Se Jung Kwon, Baeseong Park, Daehwan Oh
When the number of quantization bits is relatively low, however, non-convex optimization becomes unavoidable for improving model accuracy.
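To see why the objective becomes non-convex at low bit widths, consider tuning the clipping threshold of a symmetric uniform quantizer to minimize reconstruction error: rounding makes the error a non-convex function of the threshold with many local minima. The sketch below is a generic illustration using a simple grid search, not the optimization method proposed in the paper.

```python
import torch

def uniform_quantize(w: torch.Tensor, clip: float, num_bits: int) -> torch.Tensor:
    """Symmetric uniform quantization of w with clipping range [-clip, clip]."""
    levels = 2 ** (num_bits - 1) - 1
    scale = clip / levels
    return torch.clamp(torch.round(w / scale), -levels, levels) * scale

def search_clip(w: torch.Tensor, num_bits: int, num_candidates: int = 100) -> float:
    """Pick the clipping threshold that minimizes reconstruction MSE.

    Because of rounding, the MSE is non-convex in the threshold, so a
    simple grid search is used instead of a closed-form or convex solver.
    """
    max_abs = w.abs().max().item()
    best_clip, best_err = max_abs, float("inf")
    for i in range(1, num_candidates + 1):
        clip = max_abs * i / num_candidates
        err = (w - uniform_quantize(w, clip, num_bits)).pow(2).mean().item()
        if err < best_err:
            best_clip, best_err = clip, err
    return best_clip
```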
no code implementations • 1 Jan 2021 • Se Jung Kwon, Dongsoo Lee, Yongkweon Jeon, Byeongwook Kim, Bae Seong Park, Yeonju Ro
As a practical model compression technique, parameter quantization is especially effective for language models, which are associated with a large memory footprint.