no code implementations • 28 Dec 2023 • Yossi Arjevani
The theoretical results, stated and proved for o-minimal structures, show that the set comprising all tangency arcs is topologically sufficiently tame to enable a numerical construction of tangency arcs, and so to compare how minima of both types are positioned relative to adjacent critical points.
no code implementations • 13 Jun 2023 • Yossi Arjevani, Gal Vinograd
The rich symmetry structure is used to construct infinite families of critical points represented by Puiseux series in the problem dimension, and so to obtain precise analytic estimates of the value of the objective function and of the Hessian spectrum.
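Here, a Puiseux series in the problem dimension $d$ is a series in fractional powers of $1/d$, e.g. of the form $c_0 + c_1 d^{-1/2} + c_2 d^{-1} + c_3 d^{-3/2} + \cdots$ (an illustrative form; the specific exponents and coefficients are given in the paper).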
no code implementations • 12 Oct 2022 • Yossi Arjevani, Michael Field
We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss, where labels are generated by a target network.
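As an illustration of this setting, the following is a minimal sketch of the student-teacher squared loss for two-layer ReLU networks (not the authors' code; the unit outer weights, Gaussian inputs, and dimensions are illustrative assumptions):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def student_teacher_loss(W, V, X):
    """Squared loss between a student two-layer ReLU network with weights W (k x d)
    and a teacher network with weights V (k x d), averaged over inputs X (n x d).
    Both networks are assumed to have unit outer weights, a common simplification."""
    student = relu(X @ W.T).sum(axis=1)   # student predictions
    teacher = relu(X @ V.T).sum(axis=1)   # labels generated by the target network
    return 0.5 * np.mean((student - teacher) ** 2)

# Illustrative values: d-dimensional Gaussian inputs, k hidden neurons.
rng = np.random.default_rng(0)
d, k, n = 10, 4, 1000
V = rng.standard_normal((k, d))          # fixed teacher weights
W = rng.standard_normal((k, d))          # student initialization
X = rng.standard_normal((n, d))
print(student_teacher_loss(W, V, X))
```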
no code implementations • NeurIPS 2021 • Yossi Arjevani, Michael Field
In particular, we derive analytic estimates for the loss at different minima, and prove that modulo $O(d^{-1/2})$-terms the Hessian spectrum concentrates near small positive constants, with the exception of $\Theta(d)$ eigenvalues which grow linearly with $d$.
no code implementations • 6 Jul 2021 • Yossi Arjevani, Michael Field
Motivated by questions originating from the study of a class of shallow student-teacher neural networks, we develop methods for the analysis of spurious minima in classes of gradient equivariant dynamics related to neural networks.
no code implementations • 10 Mar 2021 • Yossi Arjevani, Joan Bruna, Michael Field, Joe Kileel, Matthew Trager, Francis Williams
In this note, we consider the highly nonconvex optimization problem associated with computing the rank decomposition of symmetric tensors.
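For concreteness, the nonconvex objective in question can be sketched as follows (assuming a third-order symmetric tensor and a squared Frobenius error; an illustration of the problem, not code from the paper):

```python
import numpy as np

def sym_rank_decomposition_loss(A, T):
    """Nonconvex objective for symmetric tensor rank decomposition:
    approximate a symmetric 3-tensor T (d x d x d) by a sum of r rank-one
    terms a_i (x) a_i (x) a_i, where the rows of A (shape r x d) are the a_i."""
    approx = np.einsum('ri,rj,rk->ijk', A, A, A)   # sum_i a_i (x) a_i (x) a_i
    return 0.5 * np.sum((approx - T) ** 2)

# Illustrative usage: a random symmetric target of rank at most 3.
rng = np.random.default_rng(1)
d, r = 5, 3
B = rng.standard_normal((r, d))
T = np.einsum('ri,rj,rk->ijk', B, B, B)
A0 = rng.standard_normal((r, d))
print(sym_rank_decomposition_loss(A0, T))
```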
no code implementations • NeurIPS 2020 • Yossi Arjevani, Michael Field
We consider the optimization problem associated with fitting two-layer ReLU networks with respect to the squared loss, where labels are generated by a target network.
no code implementations • 24 Jun 2020 • Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan
We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed.
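For context, the Hessian-vector product oracle used by such methods never forms the Hessian explicitly; a minimal sketch of one way to realize it, via finite differences of gradients (an illustration of the oracle, not the paper's algorithm), is:

```python
import numpy as np

def hvp_finite_diff(grad_fn, x, v, eps=1e-5):
    """Approximate the Hessian-vector product H(x) v using two gradient calls,
    so second-order information is obtained at first-order cost."""
    return (grad_fn(x + eps * v) - grad_fn(x - eps * v)) / (2 * eps)

# Illustrative check on a quadratic F(x) = 0.5 x^T A x, whose Hessian is A.
rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
A = M @ M.T
grad_fn = lambda x: A @ x
x, v = rng.standard_normal(5), rng.standard_normal(5)
print(np.allclose(hvp_finite_diff(grad_fn, x, v), A @ v, atol=1e-3))
```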
no code implementations • NeurIPS 2020 • Yossi Arjevani, Joan Bruna, Bugra Can, Mert Gürbüzbalaban, Stefanie Jegelka, Hongzhou Lin
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
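For context, a standard primal baseline in this setting is decentralized gradient descent with a doubly stochastic mixing matrix. The sketch below illustrates that baseline, not the framework proposed in the paper; the local quadratics and the complete-graph mixing matrix are assumptions made for the example:

```python
import numpy as np

def decentralized_gd(grads, W, X0, step, iters):
    """Decentralized gradient descent: each node averages its iterate with its
    neighbours via the doubly stochastic mixing matrix W, then takes a step
    along its own local gradient."""
    X = X0.copy()                                   # row i = node i's iterate
    for _ in range(iters):
        G = np.array([g(x) for g, x in zip(grads, X)])
        X = W @ X - step * G
    return X.mean(axis=0)

# Example: 3 nodes with local functions f_i(x) = 0.5 * c_i * ||x - m_i||^2.
coefs = [1.0, 2.0, 3.0]
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
grads = [lambda x, c=c, m=m: c * (x - m) for c, m in zip(coefs, centers)]
W = np.full((3, 3), 1.0 / 3.0)                      # complete-graph averaging
print(decentralized_gd(grads, W, np.zeros((3, 2)), step=0.1, iters=500))
```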
no code implementations • 23 Mar 2020 • Yossi Arjevani, Michael Field
We consider the optimization problem associated with fitting two-layer ReLU networks with $k$ hidden neurons, where labels are assumed to be generated by a (teacher) neural network.
no code implementations • 9 Feb 2020 • Yossi Arjevani, Amit Daniely, Stefanie Jegelka, Hongzhou Lin
Recent advances in randomized incremental methods for minimizing $L$-smooth $\mu$-strongly convex finite sums have culminated in tight complexities of $\tilde{O}((n+\sqrt{n L/\mu})\log(1/\epsilon))$ and $O(n+\sqrt{nL/\epsilon})$ for the strongly convex ($\mu>0$) and convex ($\mu=0$) cases, respectively, where $n$ denotes the number of individual functions.
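For a sense of scale (an illustrative instance, not a figure from the paper): with $n=10^4$ component functions and condition number $L/\mu=10^2$, the strongly convex bound gives $n+\sqrt{nL/\mu}=10^4+10^3$ oracle calls per $\log(1/\epsilon)$ factor, so the incremental term $\sqrt{nL/\mu}$ is dominated by $n$.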
no code implementations • 26 Dec 2019 • Yossi Arjevani, Michael Field
We consider the optimization problem associated with fitting two-layer ReLU networks with respect to the squared loss, where labels are assumed to be generated by a target network.
no code implementations • 5 Dec 2019 • Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake Woodworth
We lower bound the complexity of finding $\epsilon$-stationary points (with gradient norm at most $\epsilon$) using stochastic first-order methods.
no code implementations • 26 Jun 2018 • Yossi Arjevani, Ohad Shamir, Nathan Srebro
We provide tight finite-time convergence bounds for gradient descent and stochastic gradient descent on quadratic functions, when the gradients are delayed and reflect iterates from $\tau$ rounds ago.
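A minimal sketch of the delayed-gradient setting on a quadratic (illustrative only; the specific matrix, step size, and delay are assumptions for the example):

```python
import numpy as np

def delayed_gd(A, b, x0, step, tau, iters):
    """Gradient descent on the quadratic f(x) = 0.5 x^T A x - b^T x, where the
    gradient applied at round t was evaluated at the iterate from tau rounds
    earlier (stale gradients, as in asynchronous or distributed settings)."""
    history = [x0.copy()]
    x = x0.copy()
    for t in range(iters):
        x_stale = history[max(0, t - tau)]           # iterate from tau rounds ago
        x = x - step * (A @ x_stale - b)             # delayed gradient step
        history.append(x.copy())
    return x

# Illustrative run on a small quadratic with delay tau = 5.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
x = delayed_gd(A, b, np.zeros(2), step=0.05, tau=5, iters=2000)
print(np.linalg.norm(A @ x - b))   # small residual: the delayed iteration still converges
```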
no code implementations • NeurIPS 2017 • Yossi Arjevani
We study the conditions under which one is able to efficiently apply variance-reduction and acceleration schemes to finite-sum problems.
no code implementations • NeurIPS 2017 • Yossi Arjevani
We study the conditions under which one is able to efficiently apply variance-reduction and acceleration schemes to finite-sum optimization problems.
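A canonical variance-reduction scheme for such finite sums is SVRG; below is a minimal sketch of it on a least-squares finite sum (an illustration of the setting, not code from the paper):

```python
import numpy as np

def svrg(grads, full_grad, x0, step, epochs, m):
    """SVRG: at each epoch, compute the full gradient at a snapshot point, then
    take m inner steps using variance-reduced stochastic gradient estimates."""
    n = len(grads)
    x_snapshot = x0.copy()
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        mu = full_grad(x_snapshot)             # full gradient at the snapshot
        x = x_snapshot.copy()
        for _ in range(m):
            i = rng.integers(n)
            g = grads[i](x) - grads[i](x_snapshot) + mu   # variance-reduced estimate
            x = x - step * g
        x_snapshot = x
    return x_snapshot

# Example: least squares (1/n) sum_i 0.5 * (a_i^T x - b_i)^2 as a finite sum.
rng = np.random.default_rng(4)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
grads = [lambda x, a=A[i], y=b[i]: (a @ x - y) * a for i in range(n)]
full_grad = lambda x: A.T @ (A @ x - b) / n
x = svrg(grads, full_grad, np.zeros(d), step=0.01, epochs=30, m=2 * n)
print(np.linalg.norm(full_grad(x)))            # gradient norm after the epochs
```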
no code implementations • ICML 2017 • Yossi Arjevani, Ohad Shamir
Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations.
no code implementations • NeurIPS 2016 • Yossi Arjevani, Ohad Shamir
Many canonical machine learning problems boil down to a convex optimization problem with a finite sum structure.
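Concretely, the structure in question is $\min_x \tfrac{1}{n}\sum_{i=1}^{n} f_i(x)$, as in regularized empirical risk minimization, where each $f_i$ is the loss incurred on a single training example.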
no code implementations • 11 May 2016 • Yossi Arjevani, Ohad Shamir
We consider a broad class of first-order optimization algorithms which are "oblivious", in the sense that their step sizes are scheduled regardless of the function under consideration, except for limited side-information such as smoothness or strong convexity parameters.
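To make the notion concrete, here is a minimal sketch of an oblivious scheme (illustrative, not taken from the paper): gradient descent whose step sizes depend only on a smoothness bound $L$, never on the particular function being minimized:

```python
import numpy as np

def oblivious_gd(grad, x0, L, iters):
    """Gradient descent with the oblivious step-size schedule 1/L: the steps
    are fixed in advance using only the smoothness parameter, independently
    of the function under consideration."""
    x = x0.copy()
    for _ in range(iters):
        x = x - (1.0 / L) * grad(x)
    return x

# Illustrative use on a smooth convex quadratic with known smoothness bound.
A = np.diag([1.0, 4.0])              # Hessian eigenvalues 1 and 4, so L = 4
grad = lambda x: A @ x
print(oblivious_gd(grad, np.array([3.0, -2.0]), L=4.0, iters=100))
```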
no code implementations • NeurIPS 2015 • Yossi Arjevani, Ohad Shamir
We study the fundamental limits to communication-efficient distributed methods for convex learning and optimization, under different assumptions on the information available to individual machines, and the types of functions considered.
no code implementations • 23 Mar 2015 • Yossi Arjevani, Shai Shalev-Shwartz, Ohad Shamir
This, in turn, reveals a powerful connection between a class of optimization algorithms and the analytic theory of polynomials, whereby new lower and upper bounds are derived.
no code implementations • 23 Oct 2014 • Yossi Arjevani
In this thesis we develop a novel framework to study smooth and strongly convex optimization algorithms, both deterministic and stochastic.