Search Results for author: Jonah Yi

Found 2 papers, 0 papers with code

KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization

no code implementations • 7 May 2024 • Tianyi Zhang, Jonah Yi, Zhaozhuo Xu, Anshumali Shrivastava

We observe that distinct channels of a key/value activation embedding are highly inter-dependent, and the joint entropy of multiple channels grows at a slower rate than the sum of their marginal entropies.
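As a rough illustration of the entropy claim in this abstract (not the paper's method or data), the sketch below computes empirical Shannon entropies for two toy quantized channels. The channel values and the helper name `entropy` are hypothetical and chosen only to show that, when channels are correlated, the joint entropy can fall well below the sum of the marginal entropies.

```python
import math
from collections import Counter


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical toy data: two quantized channels that mostly move together.
ch_a = [0, 0, 1, 1, 2, 2, 3, 3]
ch_b = [0, 0, 1, 1, 2, 2, 3, 2]

h_a = entropy(ch_a)                        # marginal entropy of channel A
h_b = entropy(ch_b)                        # marginal entropy of channel B
h_joint = entropy(list(zip(ch_a, ch_b)))   # entropy of the channel pair

print(f"H(A) + H(B) = {h_a + h_b:.3f} bits")
print(f"H(A, B)     = {h_joint:.3f} bits")
```

The gap between the two printed values is the mutual information of the pair, which suggests why encoding correlated channels jointly could need fewer bits than encoding each one separately.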

Language Modelling, Large Language Model, +1 more

CAPS: A Practical Partition Index for Filtered Similarity Search

no code implementations • 29 Aug 2023 • Gaurav Gupta, Jonah Yi, Benjamin Coleman, Chen Luo, Vihan Lakshman, Anshumali Shrivastava

With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest.

Representation Learning
