no code implementations • 8 Apr 2024 • Khoi Do, Duong Nguyen, Nguyen H. Tran, Viet Dung Nguyen
First, the class-wise gradient magnitude homogenization helps alleviate the imbalance among label masks by ensuring equal consideration of the class-wise impact on model updates.
no code implementations • 25 Sep 2023 • Khoi Do, Duong Nguyen, Hoa Nguyen, Long Tran-Thanh, Nguyen-Hoang Tran, Quoc-Viet Pham
This paper explores large-batch training techniques that use a layer-wise adaptive scaling ratio (LARS) across diverse settings.
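
For readers unfamiliar with the layer-wise scaling that the snippet above mentions, here is a minimal sketch of the commonly cited LARS trust-ratio update. It is not taken from the listed paper. The function names (`l2_norm`, `lars_trust_ratio`, `lars_step`), the default values, and the zero-norm fallback are assumptions chosen for illustration.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_trust_ratio(weights, grads, trust_coef=0.001, weight_decay=0.0, eps=1e-12):
    """Layer-wise ratio: trust_coef * ||w|| / (||g|| + weight_decay * ||w||).

    Falls back to 1.0 when either norm is zero (an assumption made here so
    that freshly zero-initialized layers are still updated).
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    if w_norm == 0.0 or g_norm == 0.0:
        return 1.0
    return trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)


def lars_step(layers, grads, base_lr=0.1, trust_coef=0.001, weight_decay=0.0):
    """Apply one plain-SGD step in which each layer gets its own scaled rate.

    `layers` and `grads` are lists of flat lists, one entry per layer.
    """
    updated = []
    for w, g in zip(layers, grads):
        ratio = lars_trust_ratio(w, g, trust_coef, weight_decay)
        updated.append(
            [wi - base_lr * ratio * (gi + weight_decay * wi) for wi, gi in zip(w, g)]
        )
    return updated
```

Because the ratio is computed per layer, a layer whose gradient norm is large relative to its weight norm takes a proportionally smaller step, which is the property usually credited with keeping large-batch training stable. Momentum and learning-rate warm-up, which are typically combined with LARS in practice, are omitted from this sketch.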