1 code implementation • 16 May 2024 • Rhea Sanjay Sukthanker, Arber Zela, Benedikt Staffler, Jorg K. H. Franke, Frank Hutter
To this end, we propose HW-GPT-Bench, a hardware-aware language model surrogate benchmark, where we leverage weight-sharing techniques from Neural Architecture Search (NAS) to efficiently train a supernet proxy, encompassing language models of varying scales in a single model.