1 code implementation • 12 Feb 2024 • Noam Razin, Yotam Alexander, Edo Cohen-Karlik, Raja Giryes, Amir Globerson, Nadav Cohen
This paper theoretically studies the implicit bias of policy gradient in terms of extrapolation to unseen initial states.
1 code implementation • 31 Oct 2023 • Noam Razin, Hattie Zhou, Omid Saremi, Vimal Thilak, Arwen Bradley, Preetum Nakkiran, Joshua Susskind, Etai Littwin
Pretrained language models are commonly aligned with human preferences and downstream tasks via reinforcement finetuning (RFT), which refers to maximizing a (possibly learned) reward function using policy gradient algorithms.
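As a rough illustration of the policy-gradient reward maximization that RFT refers to, here is a minimal REINFORCE-style sketch; `policy`, `reward_model`, and the sampling interface are hypothetical placeholders, not the paper's setup.

```python
# Minimal REINFORCE-style sketch of reinforcement finetuning (RFT):
# sample completions from the current policy, score them with a (possibly
# learned) reward model, and ascend the policy gradient of expected reward.
# `policy` and `reward_model` are hypothetical placeholders, not the paper's code.
import torch

def rft_step(policy, reward_model, prompts, optimizer):
    completions, log_probs = policy.sample(prompts)       # log_probs: (batch,) sum of token log-probs
    with torch.no_grad():
        rewards = reward_model(prompts, completions)      # (batch,) scalar reward per completion
        baseline = rewards.mean()                         # simple variance-reduction baseline
    loss = -((rewards - baseline) * log_probs).mean()     # REINFORCE estimator of -E[reward]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return rewards.mean().item()
```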
no code implementations • 24 Oct 2023 • Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran
Large language models exhibit surprising emergent generalization properties, yet also struggle on many simple reasoning tasks such as arithmetic and parity.
1 code implementation • 20 Mar 2023 • Yotam Alexander, Nimrod De La Vega, Noam Razin, Nadav Cohen
Focusing on locally connected neural networks (a prevalent family of architectures that includes convolutional and recurrent neural networks as well as local self-attention models), we address this problem by adopting theoretical tools from quantum physics.
1 code implementation • NeurIPS 2023 • Noam Razin, Tom Verbin, Nadav Cohen
Formalizing strength of interactions through an established measure known as separation rank, we quantify the ability of certain GNNs to model interaction between a given subset of vertices and its complement, i.e., between the sides of a given partition of input vertices.
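For reference, separation rank with respect to a partition (A, B) of the input variables is standardly defined as the minimal number of separable summands needed to express the function (a generic statement of the measure, not quoted from the paper):

```latex
% Separation rank of f with respect to a partition (A, B) of its input variables:
% the minimal number of summands needed to write f as a sum of separable terms.
\[
\operatorname{sep}(f; A, B) \;=\; \min \Big\{ R \in \mathbb{N} \;:\;
  f(\mathbf{x}) = \sum_{r=1}^{R} g_r(\mathbf{x}_A)\, h_r(\mathbf{x}_B) \Big\}
\]
% where x_A and x_B are the variables on the two sides of the partition;
% sep = 1 means no interaction is modeled between the sides, and higher
% values indicate stronger modeled interaction.
```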
1 code implementation • 27 Jan 2022 • Noam Razin, Asaf Maman, Nadav Cohen
In the pursuit of explaining implicit regularization in deep learning, considerable focus has been given to matrix and tensor factorizations, which correspond to simplified neural networks.
1 code implementation • 19 Feb 2021 • Noam Razin, Asaf Maman, Nadav Cohen
Recent efforts to unravel the mystery of implicit regularization in deep learning have led to a theoretical focus on matrix factorization, i.e., matrix completion via a linear neural network.
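To make the object of study concrete, below is a minimal sketch of matrix completion via a linear neural network (deep matrix factorization): a product of weight matrices is fit to the observed entries with gradient descent. Dimensions, depth, and learning rate are illustrative, not taken from the paper.

```python
# Minimal sketch of matrix completion via a linear neural network
# (deep matrix factorization): the recovered matrix is the end-to-end
# product W_L ... W_1, fit to the observed entries only.
import torch

d, depth, lr = 50, 3, 0.05
target = torch.randn(d, d)
mask = (torch.rand(d, d) < 0.3).float()                   # observed-entry mask

factors = [(1e-2 * torch.randn(d, d)).requires_grad_() for _ in range(depth)]
opt = torch.optim.SGD(factors, lr=lr)

for step in range(2000):
    W = factors[0]
    for f in factors[1:]:
        W = f @ W                                          # end-to-end linear network
    loss = ((W - target) * mask).pow(2).sum() / mask.sum() # loss on observed entries only
    opt.zero_grad()
    loss.backward()
    opt.step()
```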
no code implementations • Findings of the Association for Computational Linguistics 2020 • Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, Noam Koenigstein
In addition, we introduce a new language understanding task for wine recommendations using similarities based on professional wine reviews.
1 code implementation • NeurIPS 2020 • Noam Razin, Nadav Cohen
Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning.
1 code implementation • 14 Aug 2019 • Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein
In this paper, we introduce Distilled Sentence Embedding (DSE), a model based on knowledge distillation from cross-attentive models, focusing on sentence-pair tasks.
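A rough sketch of the general distillation idea: a cross-attentive teacher scores sentence pairs jointly, while a student that embeds each sentence independently is trained to reproduce those scores. `teacher`, `student`, and `score_head` are hypothetical placeholders, not the released DSE implementation.

```python
# Illustrative sketch of distilling a cross-attentive teacher into a model
# that embeds sentences independently (placeholders, not the DSE code).
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, score_head, sent_a, sent_b, optimizer):
    with torch.no_grad():
        teacher_logits = teacher(sent_a, sent_b)           # cross-attention over the pair
    emb_a = student(sent_a)                                # independent sentence embeddings
    emb_b = student(sent_b)
    student_logits = score_head(torch.cat([emb_a, emb_b, emb_a * emb_b], dim=-1))
    loss = F.mse_loss(student_logits, teacher_logits)      # match the teacher's predictions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```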