no code implementations • 23 May 2023 • Achraf Bahamou, Donald Goldfarb
We propose a new per-layer adaptive step-size procedure for stochastic first-order optimization methods that minimize empirical loss functions in deep learning, eliminating the need for the user to tune the learning rate (LR).
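The abstract does not spell out the step-size rule itself, so the following is only a rough, hypothetical illustration of the per-layer idea: each layer receives its own step length from a local quadratic model along that layer's negative-gradient direction. The toy quadratic losses, layer sizes, and helper names are assumptions made for the sketch, not the paper's method.

```python
import numpy as np

# Toy per-layer quadratic losses f_l(w) = 0.5 * w^T A_l w - b_l^T w, standing in
# for the curvature information a real method would have to estimate stochastically.
rng = np.random.default_rng(0)
layers = []
for dim in (8, 4):                       # hypothetical layer sizes
    M = rng.standard_normal((dim, dim))
    layers.append({"A": M @ M.T + np.eye(dim),
                   "b": rng.standard_normal(dim),
                   "w": np.zeros(dim)})

def per_layer_step(layer):
    """One update with a layer-specific step size from a local quadratic model."""
    A, b, w = layer["A"], layer["b"], layer["w"]
    g = A @ w - b                        # this layer's gradient
    d = -g                               # descent direction
    eta = (g @ g) / (d @ (A @ d))        # exact minimizer along d for a quadratic
    layer["w"] = w + eta * d
    return eta

for step in range(3):
    etas = [per_layer_step(layer) for layer in layers]
    print(f"step {step}: per-layer step sizes = {[round(float(e), 4) for e in etas]}")
```

Each layer thus gets an automatically chosen step length; no global learning rate is tuned by hand in this toy setting.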
no code implementations • 8 Feb 2022 • Achraf Bahamou, Donald Goldfarb, Yi Ren
Specifically, our method uses a block-diagonal approximation to the empirical Fisher matrix, where for each layer in the DNN, whether it is convolutional or feed-forward and fully connected, the associated diagonal block is itself block-diagonal and is composed of a large number of mini-blocks of modest size.
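As a loose illustration of the mini-block idea, the sketch below builds, for a single fully connected layer, one small empirical-Fisher block per output row and uses the inverted blocks to precondition the gradient. The blocking by output row, the sizes, and the damping value are assumptions for the sketch, not necessarily the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, out_dim, in_dim = 32, 6, 5          # hypothetical sizes
# Per-sample gradients of one fully connected layer's weight matrix.
per_sample_grads = rng.standard_normal((n_samples, out_dim, in_dim))

damping = 1e-3
inv_blocks = []
for r in range(out_dim):                       # one mini-block per output row
    G = per_sample_grads[:, r, :]              # (n_samples, in_dim)
    F_block = G.T @ G / n_samples + damping * np.eye(in_dim)
    inv_blocks.append(np.linalg.inv(F_block))  # small, so cheap to invert

mean_grad = per_sample_grads.mean(axis=0)
precond_grad = np.stack([inv_blocks[r] @ mean_grad[r] for r in range(out_dim)])
print(precond_grad.shape)                      # (6, 5): preconditioned layer update
```

The point of the mini-block structure is that each block stays small enough to invert (or factor) cheaply, even for wide layers.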
1 code implementation • NeurIPS 2021 • Yi Ren, Donald Goldfarb
Based on the so-called tensor normal (TN) distribution, we propose and analyze a brand new approximate natural gradient method, Tensor Normal Training (TNT), which, like Shampoo, only requires knowledge of the shape of the training parameters.
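A minimal sketch of the shape-only, Kronecker-factored flavor of such a method: for a weight matrix of shape (m, n), two factor matrices A (m x m) and B (n x n) are estimated from per-sample gradients and the preconditioned direction is A^{-1} G B^{-1}. The sample counts, damping, and estimation scheme below are assumptions, not TNT's exact statistics.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, n_samples = 6, 4, 64                     # only the weight shape (m, n) matters
G_samples = rng.standard_normal((n_samples, m, n))   # per-sample gradients of W

damping = 1e-3
# Factor covariances along each tensor mode (the Kronecker / tensor-normal factors).
A = sum(G @ G.T for G in G_samples) / (n_samples * n) + damping * np.eye(m)
B = sum(G.T @ G for G in G_samples) / (n_samples * m) + damping * np.eye(n)

G_mean = G_samples.mean(axis=0)
# Approximate natural-gradient direction: A^{-1} G B^{-1} instead of a full Fisher solve.
nat_grad = np.linalg.solve(A, G_mean) @ np.linalg.inv(B)
print(nat_grad.shape)                          # (6, 4)
```

Because only factor matrices of sizes m and n are ever formed, the cost scales with the layer dimensions rather than with the full m*n parameter count.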
no code implementations • 12 Feb 2021 • Yi Ren, Achraf Bahamou, Donald Goldfarb
We also propose several improvements to the methods in Goldfarb et al. (2020) that can be applied to both MLPs and CNNs.
1 code implementation • NeurIPS 2020 • Donald Goldfarb, Yi Ren, Achraf Bahamou
We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs).
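For reference, the textbook BFGS update of an inverse-Hessian approximation is shown below on a tiny deterministic quadratic; the paper's K-BFGS/K-L-BFGS methods maintain damped, Kronecker-factored, block-diagonal versions of such approximations per layer, which this sketch does not attempt to reproduce.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Standard BFGS update of an inverse-Hessian approximation H from
    a parameter difference s and a gradient difference y."""
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# Usage on f(w) = 0.5 * w^T A w, far simpler than a stochastic DNN loss.
A = np.array([[3.0, 0.5], [0.5, 2.0]])
w, H = np.array([1.0, -1.0]), np.eye(2)
for _ in range(5):
    g = A @ w
    w_new = w - H @ g
    s, y = w_new - w, A @ w_new - g            # curvature pair (s, y)
    H = bfgs_inverse_update(H, s, y)
    w = w_new
print(w)                                       # converges toward the minimizer at 0
```

In the stochastic, per-layer setting the same (s, y) recursion is applied to Kronecker factors, with damping to keep the updates well defined under noise.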
no code implementations • 31 Dec 2019 • Achraf Bahamou, Donald Goldfarb
We also propose an adaptive version of ADAM that eliminates the need to tune the base learning rate and compares favorably to fine-tuned ADAM on training DNNs.
no code implementations • 5 Jun 2019 • Yi Ren, Donald Goldfarb
We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data sets.
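For context, the classical Levenberg-Marquardt step that such variants build on solves a damped Gauss-Newton system; the display below is the textbook form (with J the Jacobian of the residual vector r at the current iterate w_k and lambda > 0 the damping parameter), not the paper's specific stochastic variant, in which, for instance, a Fisher matrix may take the place of J^T J.

```latex
(J^{\top} J + \lambda I)\, p_k = -\, J^{\top} r(w_k), \qquad w_{k+1} = w_k + p_k
```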
1 code implementation • NeurIPS 2019 • Yunfei Teng, Wenbo Gao, Francois Chalus, Anna Choromanska, Donald Goldfarb, Adrian Weller
Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction composed of two attractive forces: one toward the local leader and one toward the global leader (the best performer among all workers).
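A toy, single-process sketch of this multi-leader update is given below: every worker takes a gradient step plus two attractive pulls, one toward its group's local leader and one toward the global leader. The objective, group sizes, and coefficients (eta, lam_local, lam_global) are hypothetical placeholders rather than the paper's settings, and the real algorithm runs asynchronously across machines.

```python
import numpy as np

rng = np.random.default_rng(2)
n_groups, workers_per_group, dim = 2, 3, 4
workers = rng.standard_normal((n_groups, workers_per_group, dim))

def loss(w):                                   # toy objective standing in for the training loss
    return float(np.sum((w - 1.0) ** 2))

def grad(w):
    return 2.0 * (w - 1.0)

eta, lam_local, lam_global = 0.1, 0.05, 0.05   # hypothetical coefficients
for _ in range(100):
    # Local leader = best worker within each group; global leader = best worker overall.
    local_leaders = np.array(
        [grp[min(range(workers_per_group), key=lambda i: loss(grp[i]))] for grp in workers])
    flat = workers.reshape(-1, dim)
    global_leader = flat[min(range(len(flat)), key=lambda i: loss(flat[i]))].copy()
    for gi in range(n_groups):
        for wi in range(workers_per_group):
            w = workers[gi, wi]
            workers[gi, wi] = (w - eta * grad(w)
                               + lam_local * (local_leaders[gi] - w)    # pull toward local leader
                               + lam_global * (global_leader - w))      # pull toward global leader
print(round(loss(workers[0, 0]), 6))           # all workers are drawn toward the minimizer
```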
no code implementations • 26 Mar 2019 • Yuan Gao, Christian Kroer, Donald Goldfarb
In particular, the increasing averages consistently outperform the uniform averages in all test problems by orders of magnitude.
no code implementations • ICML 2017 • Chaoxu Zhou, Wenbo Gao, Donald Goldfarb
We propose a novel class of stochastic, adaptive methods for minimizing self-concordant functions which can be expressed as an expected value.
no code implementations • 5 Jul 2016 • Xiao Wang, Shiqian Ma, Donald Goldfarb, Wei Liu
In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that noisy information about the gradients of the objective function is available via a stochastic first-order oracle (SFO).
no code implementations • 29 Mar 2014 • Cun Mu, Yuqian Zhang, John Wright, Donald Goldfarb
Recovering matrices from compressive and grossly corrupted observations is a fundamental problem in robust statistics, with rich applications in computer vision and machine learning.
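For reference, the standard convex formulation behind this recovery task (principal component pursuit with a general, possibly compressive, linear measurement operator) is stated below; the notation is the usual one and is not claimed to match the paper's exact model.

```latex
\min_{L,\,S}\;\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad \mathcal{A}(L + S) = \mathcal{A}(M)
```

Here the nuclear norm promotes a low-rank L, the entrywise l1 norm promotes a sparse corruption S, M is the underlying matrix, and A is the measurement operator.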
no code implementations • 24 Nov 2013 • Donald Goldfarb, Zhiwei Qin
Robust tensor recovery plays an instrumental role in robustifying tensor decompositions for multilinear data analysis against outliers, gross corruptions, and missing values, and it has a diverse array of applications.
no code implementations • 26 Sep 2013 • Necdet Serhat Aybat, Donald Goldfarb, Shiqian Ma
Moreover, if the observed data matrix has also been corrupted by a dense noise matrix in addition to gross sparse error, then the stable principal component pursuit (SPCP) problem is solved to recover the low-rank matrix.
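A standard statement of the SPCP problem mentioned here, in assumed (not quoted) notation, relaxes the exact decomposition constraint to a noise ball:

```latex
\min_{L,\,S}\;\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad \|M - L - S\|_{F} \le \delta
```

where delta bounds the Frobenius norm of the dense noise; setting delta = 0 recovers the noiseless robust PCA model.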
Optimization and Control
no code implementations • 22 Jul 2013 • Cun Mu, Bo Huang, John Wright, Donald Goldfarb
The most popular convex relaxation of this problem minimizes the sum of the nuclear norms of the unfoldings of the tensor.
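Written out in generic notation (which may differ from the paper's), this sum-of-nuclear-norms relaxation reads:

```latex
\min_{\mathcal{X}}\;\; \sum_{i=1}^{K} \big\|\mathcal{X}_{(i)}\big\|_{*}
\quad \text{s.t.} \quad \mathcal{G}(\mathcal{X}) = \mathcal{G}(\mathcal{X}_{0})
```

where X_(i) denotes the mode-i unfolding (matricization) of the K-way tensor X and G is the observation operator.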
no code implementations • 11 May 2011 • Necdet Serhat Aybat, Donald Goldfarb, Garud Iyengar
The stable principal component pursuit (SPCP) problem is a non-smooth convex optimization problem, the solution of which has been shown both in theory and in practice to enable one to recover the low-rank and sparse components of a matrix whose elements have been corrupted by Gaussian noise.
Optimization and Control
no code implementations • NeurIPS 2010 • Katya Scheinberg, Shiqian Ma, Donald Goldfarb
Gaussian graphical models are of great interest in statistical learning.
no code implementations • 23 Dec 2009 • Donald Goldfarb, Shiqian Ma, Katya Scheinberg
We present in this paper first-order alternating linearization algorithms based on an alternating direction augmented Lagrangian approach for minimizing the sum of two convex functions.
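One generic alternating-linearization step for minimizing f(x) + g(x) with both terms smooth is sketched below; this is a simplified view under that smoothness assumption, whereas the paper's algorithms are built on an alternating direction augmented Lagrangian and also handle nonsmooth terms through proximal subproblems.

```latex
x^{k+1} = \arg\min_{x}\; f(x) + \langle \nabla g(y^{k}),\, x - y^{k} \rangle + \tfrac{1}{2\mu}\|x - y^{k}\|^{2},
\qquad
y^{k+1} = \arg\min_{y}\; \langle \nabla f(x^{k+1}),\, y - x^{k+1} \rangle + g(y) + \tfrac{1}{2\mu}\|y - x^{k+1}\|^{2}
```

Each subproblem keeps one of the two functions exact and replaces the other by its linearization plus a proximal term of weight 1/(2 mu).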
1 code implementation • 11 May 2009 • Shiqian Ma, Donald Goldfarb, Lifeng Chen
The tightest convex relaxation of this problem is the linearly constrained nuclear norm minimization problem.
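In standard (assumed) notation, that relaxation is:

```latex
\min_{X}\;\; \|X\|_{*} \quad \text{s.t.} \quad \mathcal{A}(X) = b
```

For matrix completion the constraint specializes to X_ij = M_ij for all observed entries (i, j) in Omega.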
Optimization and Control • Information Theory