Search Results for author: Xinchen Jin

Found 1 paper, 1 paper with code

Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping

1 code implementation • 24 Jun 2023 • Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang, James Demmel

In anticipation of workloads that involve serving many such large models to handle different tasks, we develop Computron, a system that uses memory swapping to serve multiple distributed models on a shared GPU cluster.
