9 Mar 2024 • Shu Liu, Asim Biswal, Audrey Cheng, Xiangxi Mo, Shiyi Cao, Joseph E. Gonzalez, Ion Stoica, Matei Zaharia
In this paper, we explore how to optimize LLM inference for analytical workloads that invoke LLMs within relational queries.
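To make the workload pattern concrete, here is a minimal, hypothetical sketch of an LLM invoked as a scalar function inside a relational query, plus one plausible inference optimization (deduplicating repeated prompts via caching). The `llm()` function, the toy data, and the caching strategy are all assumptions for illustration; they are not the authors' system or results.

```python
from functools import lru_cache

CALLS = 0  # counts how many "model" invocations actually happen

@lru_cache(maxsize=None)
def llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: a keyword-based
    sentiment classifier). lru_cache deduplicates identical prompts,
    one simple way to cut inference cost in analytical queries."""
    global CALLS
    CALLS += 1
    return "positive" if "love" in prompt.lower() else "negative"

# Toy relation; note rows 1 and 3 share the same text.
reviews = [
    {"id": 1, "text": "I love this product"},
    {"id": 2, "text": "Terrible experience"},
    {"id": 3, "text": "I love this product"},
]

# Relational analogue:
#   SELECT id, llm('Classify: ' || text) AS sentiment FROM reviews
results = [
    {"id": r["id"], "sentiment": llm("Classify: " + r["text"])}
    for r in reviews
]
```

With caching, the three-row query above issues only two model calls, since the duplicate prompt is served from the cache; real systems can go further with batching, reordering, and prefix sharing.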