Search Results for author: Wen-Pu Cai

Found 2 papers, 0 papers with code

LCQ: Low-Rank Codebook based Quantization for Large Language Models

no code implementations · 31 May 2024 · Wen-Pu Cai, Wu-Jun Li

Weight quantization has been widely used for model compression, as it reduces both storage and computational cost.

Model Compression · Quantization
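To make the storage-saving claim concrete, here is a minimal sketch of uniform weight quantization (not the paper's low-rank codebook method, just the generic baseline it builds on): float32 weights are mapped to small integer codes, and only the codes plus a scale and offset are stored, cutting storage by roughly 32 / num_bits.

```python
import numpy as np

def quantize_uniform(w, num_bits=4):
    """Uniformly quantize a weight tensor to num_bits-wide integer codes.

    Storing the uint8 codes plus (scale, w_min) instead of float32
    weights reduces storage by roughly 32 / num_bits.
    """
    w_min, w_max = w.min(), w.max()
    levels = 2 ** num_bits - 1
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = np.round((w - w_min) / scale).astype(np.uint8)
    return codes, scale, w_min

def dequantize(codes, scale, w_min):
    """Reconstruct approximate float weights from the integer codes."""
    return codes.astype(np.float32) * scale + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
codes, scale, w_min = quantize_uniform(w, num_bits=4)
w_hat = dequantize(codes, scale, w_min)
print("max abs error:", np.abs(w - w_hat).max())
```

With 4 bits the reconstruction error of any single weight is bounded by half a quantization step (scale / 2); codebook-based methods such as LCQ aim to place those levels more cleverly than this uniform grid.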

Weight Normalization based Quantization for Deep Neural Network Compression

no code implementations · 1 Jul 2019 · Wen-Pu Cai, Wu-Jun Li

WNQ adopts weight normalization to avoid the long-tail distribution of network weights and subsequently reduces the quantization error.

Neural Network Compression · Quantization
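The idea the abstract describes, that a long-tailed weight distribution inflates the quantization grid and that normalizing the weights first reduces the error, can be illustrated with a small sketch. This is not the authors' exact WNQ algorithm; it uses a simple per-row max-abs rescaling as a stand-in for weight normalization.

```python
import numpy as np

def quantize_symmetric(w, num_bits=4):
    """Symmetric uniform quantization over the tensor's full range;
    returns the dequantized approximation."""
    scale = np.abs(w).max() / (2 ** (num_bits - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
# Heavy-tailed weights: a few large outliers stretch the range.
w = rng.standard_t(df=2, size=(128, 128)).astype(np.float32)

# Direct quantization: the outliers widen the grid for every weight.
err_direct = np.abs(w - quantize_symmetric(w)).mean()

# Normalize-then-quantize (illustrative stand-in for WNQ): rescale
# each row to unit max-abs, quantize, then undo the scaling, so one
# row's outlier no longer coarsens the grid for the whole tensor.
row_scale = np.abs(w).max(axis=1, keepdims=True)
w_hat = quantize_symmetric(w / row_scale) * row_scale
err_norm = np.abs(w - w_hat).mean()

print("mean abs error, direct:    ", err_direct)
print("mean abs error, normalized:", err_norm)
```

On a long-tailed tensor like this one, the normalized variant yields a noticeably lower mean error, which is the effect the WNQ abstract points to.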
