Search Results for author: Tianshi Xu

Found 4 papers, 0 papers with code

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

no code implementations • 25 May 2024 • Chenqi Lin, Tianshi Xu, Zebin Yang, Runsheng Wang, Ru Huang, Meng Li

We observe that the overhead mainly stems from neglecting 1) the one-hot nature of user queries and 2) the robustness of the embedding table to low bit-width quantization noise.

Quantization

PrivCirNet: Efficient Private Inference via Block Circulant Transformation

no code implementations • 23 May 2024 • Tianshi Xu, Lemeng Wu, Runsheng Wang, Meng Li

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead.

HEQuant: Marrying Homomorphic Encryption and Quantization for Communication-Efficient Private Inference

no code implementations • 29 Jan 2024 • Tianshi Xu, Meng Li, Runsheng Wang

Compared with prior-art HE-based protocols, e.g., CrypTFlow2, Cheetah, and Iron, HEQuant achieves $3.5\sim 23.4\times$ communication reduction and $3.0\sim 9.3\times$ latency reduction.

Quantization

Falcon: Accelerating Homomorphically Encrypted Convolutions for Efficient Private Mobile Network Inference

no code implementations • 25 Aug 2023 • Tianshi Xu, Meng Li, Runsheng Wang, Ru Huang

Efficient networks, e.g., MobileNetV2 and EfficientNet, achieve state-of-the-art (SOTA) accuracy with lightweight computation.
