Search Results for author: Mahmut T. Kandemir

Found 2 papers, 0 papers with code

GPU Cluster Scheduling for Network-Sensitive Deep Learning

no code implementations · 29 Jan 2024 · Aakash Sharma, Vivek M. Bhasi, Sonali Singh, George Kesidis, Mahmut T. Kandemir, Chita R. Das

We propose a novel GPU-cluster scheduler for distributed DL (DDL) workloads that enables proximity-based consolidation of GPU resources according to the DDL jobs' sensitivities to anticipated communication-network delays.

Scheduling

Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks

no code implementations · ICLR 2022 · Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Mahmut T. Kandemir, Anand Sivasubramaniam

To solve the performance degradation, we propose to apply Global Server Corrections on the server to refine the locally learned models.
