Search Results for author: Tianshi Xu

Found 4 papers, 0 papers with code

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

no code implementations • 25 May 2024 • Chenqi Lin, Tianshi Xu, Zebin Yang, Runsheng Wang, Ru Huang, Meng Li

We observe the overhead mainly comes from the neglect of 1) the one-hot nature of user queries and 2) the robustness of the embedding table to low bit-width quantization noise.

Quantization

PrivCirNet: Efficient Private Inference via Block Circulant Transformation

no code implementations • 23 May 2024 • Tianshi Xu, Lemeng Wu, Runsheng Wang, Meng Li

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead.

HEQuant: Marrying Homomorphic Encryption and Quantization for Communication-Efficient Private Inference

no code implementations • 29 Jan 2024 • Tianshi Xu, Meng Li, Runsheng Wang

Compared with prior-art HE-based protocols, e.g., CrypTFlow2, Cheetah, Iron, etc., HEQuant achieves $3.5\sim 23.4\times$ communication reduction and $3.0\sim 9.3\times$ latency reduction.

Quantization

Falcon: Accelerating Homomorphically Encrypted Convolutions for Efficient Private Mobile Network Inference

no code implementations • 25 Aug 2023 • Tianshi Xu, Meng Li, Runsheng Wang, Ru Huang

Efficient networks, e.g., MobileNetV2, EfficientNet, etc., achieve state-of-the-art (SOTA) accuracy with lightweight computation.
