Search Results for author: Jaeyoung Cha

Found 2 papers, 2 papers with code

Position Coupling: Leveraging Task Structure for Improved Length Generalization of Transformers

1 code implementation • 31 May 2024 • Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun

Even for simple arithmetic tasks like integer addition, it is challenging for Transformers to generalize to longer sequences than those encountered during training.

Decoder Position

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

1 code implementation • 13 Mar 2023 • Jaeyoung Cha, Jaewook Lee, Chulhee Yun

We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems.
