no code implementations • 6 May 2024 • Bo Liu, Shanshan Qin, Venkatesh Murthy, Yuhai Tu
We next compared the alignment performance of the local Hebbian rule with that of global stochastic-gradient-descent (SGD) learning for artificial neural networks.
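To make the contrast concrete, here is a minimal sketch (not code from the paper) of the two update rules for a single linear layer: the Hebbian update uses only local pre- and post-synaptic activity, while the SGD update uses the gradient of a global loss. All names and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, lr = 20, 5, 0.01
W = rng.normal(scale=0.1, size=(n_out, n_in))

def hebbian_update(W, x, lr):
    """Local rule: each weight changes based only on its own pre/post-synaptic activity."""
    y = W @ x
    return W + lr * np.outer(y, x)

def sgd_update(W, x, target, lr):
    """Global rule: weights follow the gradient of a shared loss, here 0.5*||Wx - target||^2."""
    y = W @ x
    grad = np.outer(y - target, x)
    return W - lr * grad

x = rng.normal(size=n_in)
target = rng.normal(size=n_out)
W_hebb = hebbian_update(W, x, lr)
W_sgd = sgd_update(W, x, target, lr)
```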
no code implementations • 25 Jan 2024 • Haochen Fu, Chenyi Fei, Qi Ouyang, Yuhai Tu
From a general timescale invariance, we show that TC relies on the existence of certain period-lengthening reactions, wherein the period of the system increases strongly with the rates of these reactions.
no code implementations • 1 Jun 2023 • Adrian Shuai Li, Elisa Bertino, Xuan-Hong Dang, Ankush Singla, Yuhai Tu, Mark N Wegman
We show that information useful only in the source domain can be present in the DIRep, weakening the quality of the domain adaptation.
no code implementations • 8 Dec 2022 • Steven Durr, Youssef Mroueh, Yuhai Tu, Shenshen Wang
Generative adversarial networks (GANs) are a class of machine-learning models that use adversarial training to generate new samples with the same (potentially very complex) statistics as the training samples.
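As a reminder of the adversarial setup the abstract refers to, below is a minimal GAN training step in PyTorch: a discriminator is trained to separate real from generated samples while the generator is trained to fool it. The tiny architectures and dimensions are placeholders, not the models studied in the paper.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def gan_step(real_batch):
    # Discriminator step: label real samples 1 and generated samples 0.
    z = torch.randn(real_batch.size(0), latent_dim)
    fake = G(z).detach()
    d_loss = bce(D(real_batch), torch.ones(real_batch.size(0), 1)) + \
             bce(D(fake), torch.zeros(real_batch.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: push the discriminator to label fakes as real.
    z = torch.randn(real_batch.size(0), latent_dim)
    g_loss = bce(D(G(z)), torch.ones(real_batch.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```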
no code implementations • 2 Jun 2022 • Ning Yang, Chao Tang, Yuhai Tu
Empirical studies have shown a strong correlation between the flatness of the loss landscape at a solution and its generalizability, and stochastic gradient descent (SGD) is crucial for finding these flat solutions.
1 code implementation • 21 Mar 2022 • Yu Feng, Yuhai Tu
The contribution from a given eigen-direction is the product of two geometric factors (determinants): the sharpness of the loss landscape and the standard deviation of the dual weights, which is found to scale with the weight norm of the solution.
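One way to picture a per-eigen-direction decomposition of this kind is sketched below for a toy quadratic loss: the "sharpness" of each direction is taken as a Hessian eigenvalue, and the weight fluctuations sampled around the solution are projected onto that direction. This is only an illustrative construction under those assumptions, not the paper's dual-weight definition.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 10

# Toy quadratic loss around a solution w*: L(w) = 0.5 * (w - w*)^T H (w - w*)
A = rng.normal(size=(dim, dim))
H = A @ A.T / dim                      # positive semi-definite "Hessian"
w_star = rng.normal(size=dim)

# Sampled weight fluctuations around the solution (e.g. late-time SGD iterates)
cov = np.linalg.pinv(H + 0.1 * np.eye(dim))
samples = w_star + rng.multivariate_normal(np.zeros(dim), cov, size=500)

eigvals, eigvecs = np.linalg.eigh(H)   # sharpness of each eigen-direction
proj = (samples - w_star) @ eigvecs    # fluctuations along each eigen-direction
sigma = proj.std(axis=0)               # std of the weights in each direction

# Illustrative per-direction product of sharpness and fluctuation magnitude
contribution = eigvals * sigma**2
```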
no code implementations • 2 Dec 2021 • Wei Zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu
We conduct extensive studies over 18 state-of-the-art DL models/tasks and demonstrate that DPSGD often converges in cases where SSGD diverges for large learning rates in the large batch setting.
Automatic Speech Recognition (ASR) +1
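For context on the DPSGD/SSGD comparison in the entry above: synchronous SGD averages gradients across all workers at every step, whereas decentralized parallel SGD lets each worker take a local gradient step and then average parameters only with its ring neighbors. The sketch below is a schematic of the two schemes, not the paper's implementation.

```python
import numpy as np

def ssgd_step(weights, grads, lr):
    """Synchronous SGD: every worker applies the same globally averaged gradient."""
    g_avg = np.mean(grads, axis=0)
    return [w - lr * g_avg for w in weights]

def dpsgd_step(weights, grads, lr):
    """Decentralized parallel SGD: local gradient step, then average with ring neighbors."""
    n = len(weights)
    local = [w - lr * g for w, g in zip(weights, grads)]
    return [(local[(i - 1) % n] + local[i] + local[(i + 1) % n]) / 3.0 for i in range(n)]
```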
no code implementations • 16 Jan 2021 • Yu Feng, Yuhai Tu
Without mislabeled data, we find that the SGD learning dynamics transitions from a fast learning phase to a slow exploration phase, which is associated with large changes in order parameters that characterize the alignment of SGD gradients and their mean amplitude.
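One simple reading of "alignment of SGD gradients and their mean amplitude" is as statistics over a set of minibatch gradients evaluated at the current weights; the sketch below computes a mean pairwise cosine alignment and a mean gradient norm. These are illustrative quantities, not necessarily the paper's exact order-parameter definitions.

```python
import numpy as np

def gradient_order_parameters(minibatch_grads):
    """minibatch_grads: array of shape (n_batches, n_params), one flattened gradient per minibatch."""
    G = np.asarray(minibatch_grads)
    norms = np.linalg.norm(G, axis=1)
    unit = G / norms[:, None]
    cos = unit @ unit.T                          # pairwise cosine similarities
    n = len(G)
    alignment = (cos.sum() - n) / (n * (n - 1))  # mean off-diagonal alignment
    return alignment, norms.mean()               # (alignment, mean amplitude)
```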
no code implementations • 1 Jan 2021 • Wei Zhang, Mingrui Liu, Yu Feng, Brian Kingsbury, Yuhai Tu
We conduct extensive studies over 12 state-of-the-art DL models/tasks and demonstrate that DPSGD consistently outperforms SSGD in the large batch setting, and that DPSGD converges in cases where SSGD diverges for large learning rates.
Automatic Speech Recognition (ASR) +1
no code implementations • 6 Jan 2020 • Yu Feng, Yuhai Tu
Despite the tremendous success of the Stochastic Gradient Descent (SGD) algorithm in deep learning, little is known about how SGD finds generalizable solutions in the high-dimensional weight space.
no code implementations • 30 May 2019 • Davide Chiuchiú, Yuhai Tu, Simone Pigolotti
In this paper, we study fluctuations of error and speed in biopolymer synthesis and show that they are in general correlated.
no code implementations • 19 Apr 2019 • Pouya Bashivan, Martin Schrimpf, Robert Ajemian, Irina Rish, Matthew Riemer, Yuhai Tu
Most previous approaches to this problem rely on memory replay buffers that store samples from previously learned tasks and use them to regularize learning on new ones.
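For reference, the kind of replay buffer that sentence refers to can be as simple as reservoir sampling over past examples, with stored samples mixed into each new training batch. The sketch below is a generic illustration of that prior approach, not the method proposed in this paper.

```python
import random

class ReservoirReplayBuffer:
    """Fixed-size buffer of past (x, y) examples, maintained by reservoir sampling."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))
```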
2 code implementations • ICLR 2019 • Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro
In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples.
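The transfer/interference trade-off can be read through the sign of gradient inner products between examples: a positive inner product means learning one example helps the other (transfer), a negative one means it hurts (interference). Below is a hedged PyTorch diagnostic along those lines, assuming every parameter receives a gradient; it is not the paper's full meta-learning algorithm.

```python
import torch

def pairwise_gradient_alignment(model, loss_fn, examples):
    """Return the matrix of inner products between per-example gradients.
    Positive entries suggest transfer, negative entries suggest interference."""
    grads = []
    for x, y in examples:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        grads.append(torch.cat([p.grad.reshape(-1) for p in model.parameters()]).clone())
    G = torch.stack(grads)
    return G @ G.T
```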
no code implementations • 23 Jul 2018 • Jingxiang Shen, Mariela D. Petkova, Yuhai Tu, Feng Liu, Chao Tang
Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments.