Search Results for author: Mahmut T. Kandemir

Found 2 papers, 0 papers with code

GPU Cluster Scheduling for Network-Sensitive Deep Learning

no code implementations • 29 Jan 2024 • Aakash Sharma, Vivek M. Bhasi, Sonali Singh, George Kesidis, Mahmut T. Kandemir, Chita R. Das

We propose a novel GPU-cluster scheduler for distributed DL (DDL) workloads that enables proximity-based consolidation of GPU resources according to the DDL jobs' sensitivities to anticipated communication-network delays.

Scheduling

Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks

no code implementations • ICLR 2022 • Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Mahmut T. Kandemir, Anand Sivasubramaniam

To solve the performance degradation, we propose to apply Global Server Corrections on the server to refine the locally learned models.
