no code implementations • 24 May 2024 • Penghui Qi, Xinyi Wan, Nyamdavaa Amar, Min Lin
Our evaluations demonstrate that in pure pipeline parallelism settings, our methods outperform 1F1B by 7% to 55% in terms of throughput.
1 code implementation • 30 Nov 2023 • Penghui Qi, Xinyi Wan, Guangxing Huang, Min Lin
Pipeline parallelism is one of the key components of large-scale distributed training, yet its efficiency suffers from pipeline bubbles, which were long deemed inevitable.
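To make the bubble problem concrete, here is a minimal sketch (not from the paper) of the textbook idle-time estimate for synchronous schedules such as GPipe or 1F1B: with `p` pipeline stages and `m` microbatches, each stage sits idle for roughly `p - 1` of the `m + p - 1` time slots during warm-up and cool-down.

```python
def bubble_fraction(stages: int, microbatches: int) -> float:
    """Estimate the fraction of time a stage is idle in a
    synchronous (GPipe/1F1B-style) pipeline schedule.

    Standard back-of-the-envelope formula: (p - 1) idle slots
    out of (m + p - 1) total slots per stage.
    """
    return (stages - 1) / (microbatches + stages - 1)

# Illustrative stage/microbatch counts (hypothetical configurations):
for p, m in [(4, 12), (8, 32), (16, 64)]:
    print(f"stages={p}, microbatches={m}: bubble ~ {bubble_fraction(p, m):.1%}")
```

This is why increasing the number of microbatches shrinks (but never eliminates) the bubble in such schedules, motivating scheduling approaches that remove it outright.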