no code implementations • 8 Feb 2024 • Pierre Marion, Anna Korba, Peter Bartlett, Mathieu Blondel, Valentin De Bortoli, Arnaud Doucet, Felipe Llinares-López, Courtney Paquette, Quentin Berthet
We present a new algorithm to optimize distributions defined implicitly by parameterized stochastic diffusions.
no code implementations • 15 Jan 2024 • Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett
In the setting where the constraint holds in expectation, we further specialize our results to multi-armed bandits and propose a computationally efficient algorithm for this setting, together with a regret analysis.
no code implementations • 12 Dec 2023 • Gautam Goel, Peter Bartlett
We revisit the problem of Kalman Filtering in linear dynamical systems and show that Transformers can approximate the Kalman Filter in a strong sense.
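For reference, a minimal Kalman filter predict/update step, the target map the paper shows Transformers can approximate; the matrices `A`, `C`, `Q`, `R` and all names here are illustrative placeholders, not taken from the paper.

```python
import numpy as np

def kalman_step(x, P, y, A, C, Q, R):
    """One predict/update step of the Kalman filter.

    x, P : current state estimate and its covariance
    y    : new observation
    A, C : state-transition and observation matrices
    Q, R : process- and observation-noise covariances
    """
    # Predict step.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update step with the Kalman gain.
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```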
no code implementations • 24 Jun 2022 • Aldo Pacchiano, Ofir Nachum, Nilesh Tripuraneni, Peter Bartlett
In contrast with previous work that has studied multi-task RL in other function approximation models, we show that in the presence of a bilinear optimization oracle and finite state-action spaces there exists a computationally efficient algorithm for multitask MatrixRL via a reduction to quadratic programming.
no code implementations • 16 Jun 2022 • Peter Bartlett, Piotr Indyk, Tal Wagner
Our techniques are general, and provide generalization bounds for many other recently proposed data-driven algorithms in numerical linear algebra, covering both sketching-based and multigrid-based methods.
no code implementations • 8 Mar 2022 • Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade
Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy.
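As a concrete illustration of the problem setup, here is a minimal sketch of one classical estimator, per-trajectory importance sampling, which reweights observed returns by the likelihood ratio between the evaluation and behavior policies; this is a textbook baseline, not the estimator studied in the paper, and all names are illustrative.

```python
import numpy as np

def importance_sampling_ope(trajectories, pi_e, pi_b, gamma=0.99):
    """Estimate the value of evaluation policy pi_e from data
    collected by behavior policy pi_b, via per-trajectory
    importance sampling.

    trajectories : list of trajectories, each a list of
                   (state, action, reward) tuples
    pi_e, pi_b   : callables returning the probability of
                   action a in state s
    """
    estimates = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            weight *= pi_e(a, s) / pi_b(a, s)  # cumulative likelihood ratio
            ret += gamma**t * r                # discounted return
        estimates.append(weight * ret)
    return np.mean(estimates)
```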
no code implementations • 8 Nov 2021 • Aldo Pacchiano, Peter Bartlett, Michael I. Jordan
We study the problem of information sharing and cooperation in multi-player multi-armed bandits.
no code implementations • 21 May 2021 • Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan
Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions.
no code implementations • 19 Mar 2021 • Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett
We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite, dimension.
no code implementations • NeurIPS 2021 • Aldo Pacchiano, Jonathan Lee, Peter Bartlett, Ofir Nachum
Since its introduction a decade ago, \emph{relative entropy policy search} (REPS) has demonstrated successful policy learning on a number of simulated and real-world robotic domains, not to mention providing algorithmic components used by many recently proposed reinforcement learning (RL) algorithms.
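The core computational step of REPS is a KL-regularized reweighting of samples by exponentiated advantages; below is a minimal sketch of that step, assuming a fixed temperature `eta` (in full REPS, `eta` is obtained by minimizing a dual objective).

```python
import numpy as np

def reps_weights(advantages, eta):
    """Core REPS reweighting: samples are weighted by their
    exponentiated advantage, with temperature eta controlling the
    KL divergence from the old policy. The new policy is then fit
    by weighted maximum likelihood on these weights.
    """
    a = np.asarray(advantages, dtype=float)
    w = np.exp((a - a.max()) / eta)  # shift by max for numerical stability
    return w / w.sum()
```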
no code implementations • 24 Dec 2020 • Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter Bartlett
Finally, unlike recent efforts in model selection for linear stochastic bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment, rather than a stochastic one.
no code implementations • ICML 2020 • Jonathan N. Lee, Aldo Pacchiano, Peter Bartlett, Michael I. Jordan
Maximum a posteriori (MAP) inference in discrete-valued Markov random fields is a fundamental problem in machine learning that involves identifying the most likely configuration of random variables given a distribution.
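To make the problem concrete, here is a minimal sketch of one simple baseline for MAP inference, iterated conditional modes (greedy coordinate ascent on the configuration score); it is illustrative only, not the method proposed in the paper, and it assumes a single symmetric pairwise potential shared across all edges.

```python
import numpy as np

def icm_map(unary, pairwise, edges, n_sweeps=10):
    """Iterated conditional modes: greedy local search for MAP
    inference in a discrete pairwise MRF.

    unary    : (n, k) array of unary potentials (higher = better)
    pairwise : (k, k) symmetric potential shared by all edges
    edges    : list of (i, j) index pairs
    """
    n, k = unary.shape
    x = unary.argmax(axis=1)  # initialize each variable independently
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    for _ in range(n_sweeps):
        for i in range(n):
            scores = unary[i].copy()
            for j in nbrs[i]:
                scores += pairwise[:, x[j]]  # condition on neighbors
            x[i] = scores.argmax()           # greedy coordinate update
    return x
```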
no code implementations • 17 Jun 2020 • Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang
We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB), and prove an $\widetilde{\mathcal{O}}(\frac{d\sqrt{T}}{\tau-c_0})$ bound on its $T$-round regret, where the denominator is the difference between the constraint threshold and the cost of a known feasible action.
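A schematic of the optimism-pessimism principle the algorithm's name refers to: use an optimistic (upper) estimate of reward but a pessimistic (upper) estimate of cost, and only play actions whose pessimistic cost stays below the threshold $\tau$. The confidence widths are simplified and all names are illustrative.

```python
import numpy as np

def oplb_style_select(actions, theta_r, theta_c, V_inv, beta, tau):
    """Pick the action maximizing an optimistic reward estimate among
    actions whose pessimistic (upper) cost estimate stays below the
    constraint threshold tau. Simplified illustration only.
    """
    best, best_val = None, -np.inf
    for x in actions:
        width = beta * np.sqrt(x @ V_inv @ x)  # confidence width
        reward_ucb = x @ theta_r + width       # optimism for reward
        cost_ucb = x @ theta_c + width         # pessimism for cost
        if cost_ucb <= tau and reward_ucb > best_val:
            best, best_val = x, reward_ucb
    return best  # None if no action is certified feasible
```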
no code implementations • ICLR 2020 • Raman Arora, Peter Bartlett, Poorya Mianjy, Nathan Srebro
In deep learning, we show that the data-dependent regularizer due to dropout directly controls the Rademacher complexity of the underlying class of deep neural networks.
no code implementations • 4 Feb 2019 • Yi-An Ma, Niladri Chatterji, Xiang Cheng, Nicolas Flammarion, Peter Bartlett, Michael I. Jordan
We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as optimization on the space of probability measures, with Kullback-Leibler (KL) divergence as the objective functional.
1 code implementation • 29 Oct 2018 • Dong Yin, Kannan Ramchandran, Peter Bartlett
For binary linear classifiers, we prove tight bounds for the adversarial Rademacher complexity, and show that the adversarial Rademacher complexity is never smaller than its natural counterpart and has an unavoidable dimension dependence, unless the weight vector has bounded $\ell_1$ norm.
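The key structural fact behind this analysis is that, for a linear classifier under an $\ell_\infty$ perturbation of radius $\epsilon$, the worst-case margin has a closed form: the adversary reduces $y\langle w, x\rangle$ by exactly $\epsilon\|w\|_1$. A small self-contained check (illustrative names):

```python
import numpy as np

def adversarial_margin(w, x, y, eps):
    """Worst-case margin of a linear classifier under an l_inf
    perturbation of x with radius eps: the margin y * <w, x> drops
    by exactly eps * ||w||_1.
    """
    return y * (w @ x) - eps * np.abs(w).sum()

# Brute-force sanity check on a random instance.
rng = np.random.default_rng(0)
w, x, y, eps = rng.normal(size=4), rng.normal(size=4), 1, 0.1
delta = -y * eps * np.sign(w)  # the adversary's optimal perturbation
assert np.isclose(adversarial_margin(w, x, y, eps), y * (w @ (x + delta)))
```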
no code implementations • ICML 2018 • Peter Bartlett, Dave Helmbold, Philip Long
We provide polynomial bounds on the number of iterations for gradient descent to approximate the least squares matrix $\Phi$, in the case where the initial hypothesis $\Theta_1 = \dots = \Theta_L = I$ has excess loss bounded by a small enough constant.
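A minimal sketch of the setting: gradient descent on a depth-$L$ linear network initialized at the identity, fitting targets $Y$ from inputs $X$; the hyperparameters and names are placeholders, and this illustrates the setup rather than the paper's analysis.

```python
import numpy as np

def deep_linear_gd(X, Y, L, lr=1e-3, n_iters=100):
    """Gradient descent on a depth-L linear network with identity
    initialization Theta_1 = ... = Theta_L = I, fitting the
    end-to-end map X -> Theta_L ... Theta_1 X to targets Y.
    """
    d = X.shape[0]
    thetas = [np.eye(d) for _ in range(L)]
    for _ in range(n_iters):
        below = [np.eye(d)]  # below[i] = Theta_i ... Theta_1
        for Th in thetas:
            below.append(Th @ below[-1])
        resid = below[-1] @ X - Y  # end-to-end residual
        grads, above = [], np.eye(d)
        for i in range(L - 1, -1, -1):
            # dLoss/dTheta_i = (Theta_L..Theta_{i+1})^T resid (Theta_{i-1}..Theta_1 X)^T
            grads.append((i, above.T @ resid @ (below[i] @ X).T))
            above = above @ thetas[i]
        for i, g in grads:  # simultaneous update of all layers
            thetas[i] -= lr * g
    return thetas
```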
no code implementations • 14 Jun 2018 • Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett
In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used.
1 code implementation • ICML 2018 • Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett
In particular, these algorithms are shown to achieve order-optimal statistical error rates for strongly convex losses.
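The two aggregation rules analyzed are the coordinate-wise median and the coordinate-wise trimmed mean of the workers' gradients; minimal sketches of both follow (names illustrative).

```python
import numpy as np

def coordinate_median(grads):
    """Coordinate-wise median of worker gradients (one row per worker)."""
    return np.median(np.asarray(grads), axis=0)

def trimmed_mean(grads, b):
    """Coordinate-wise trimmed mean: in each coordinate, drop the b
    largest and b smallest values, then average the rest.
    Assumes 2*b is smaller than the number of workers.
    """
    g = np.sort(np.asarray(grads), axis=0)
    return g[b:len(g) - b].mean(axis=0)
```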
1 code implementation • NeurIPS 2017 • Peter Bartlett, Dylan J. Foster, Matus Telgarsky
This paper presents a margin-based multiclass generalization bound for neural networks that scales with their margin-normalized "spectral complexity": their Lipschitz constant, meaning the product of the spectral norms of the weight matrices, times a certain correction factor.
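A simplified sketch of computing this quantity from a network's weight matrices, assuming zero reference matrices and 1-Lipschitz activations (the definition in the paper allows general reference matrices and activation Lipschitz constants):

```python
import numpy as np

def spectral_complexity(weights):
    """Sketch of the margin-normalization quantity in the bound: the
    product of spectral norms of the weight matrices (a Lipschitz
    constant), times a correction factor built from (2,1)-norms.
    """
    spec = [np.linalg.norm(W, 2) for W in weights]  # spectral norms
    lipschitz = np.prod(spec)
    # ||W^T||_{2,1} = sum of Euclidean norms of the rows of W.
    corr = sum(
        (np.linalg.norm(W, axis=1).sum() / s) ** (2.0 / 3.0)
        for W, s in zip(weights, spec)
    ) ** 1.5
    return lipschitz * corr
```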
no code implementations • 18 Jun 2017 • Dong Yin, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter Bartlett
It has been experimentally observed that distributed implementations of mini-batch stochastic gradient descent (SGD) algorithms exhibit speedup saturation and decaying generalization ability beyond a particular batch size.
no code implementations • 25 May 2017 • Xiang Cheng, Peter Bartlett
Langevin diffusion is a commonly used tool for sampling from a given distribution.
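A minimal sketch of the unadjusted Langevin algorithm, the Euler-Maruyama discretization of this diffusion; `grad_log_p` and the step size are user-supplied assumptions, and step-size tuning (or a Metropolis correction) is omitted.

```python
import numpy as np

def ula_sample(grad_log_p, x0, step, n_steps, rng=None):
    """Unadjusted Langevin algorithm: Euler-Maruyama discretization of
    the Langevin diffusion dX_t = grad log p(X_t) dt + sqrt(2) dB_t,
    whose stationary distribution is p.
    """
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = x + step * grad_log_p(x) + np.sqrt(2 * step) * noise
        samples.append(x.copy())
    return np.array(samples)

# Example: sampling a standard Gaussian, where grad log p(x) = -x.
draws = ula_sample(lambda x: -x, x0=np.zeros(2), step=0.1, n_steps=1000)
```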
no code implementations • 19 May 2013 • Peter Bartlett, Peter Grünwald, Peter Harremoës, Fares Hedayati, Wojciech Kotlowski
Keywords: SNML Exchangeability, Exponential Family, Online Learning, Logarithmic Loss, Bayesian Strategy, Jeffreys Prior, Fisher Information
no code implementations • 12 Apr 2013 • Yevgeny Seldin, Peter Bartlett, Koby Crammer
Advice-efficient prediction with expert advice (in analogy to label-efficient prediction) is a variant of the prediction with expert advice game in which, on each round, we are allowed to ask for the advice of only a limited number $M$ of the $N$ experts.
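For intuition only, a hedged sketch of how exponential weights can be adapted when just $M$ of the $N$ experts may be queried per round, using importance-weighted loss estimates to keep the update unbiased; this is an illustrative construction, not the algorithm from the paper.

```python
import numpy as np

def advice_efficient_ew(expert_losses, M, eta, rng=None):
    """Exponential weights with limited advice: each round, query a
    uniformly random subset of M experts, follow one of them, and
    update only the queried experts with importance-weighted losses
    (scaled by N/M so the loss estimate is unbiased).

    expert_losses : (T, N) array of each expert's per-round loss
    """
    rng = rng or np.random.default_rng()
    T, N = expert_losses.shape
    w = np.ones(N)
    total_loss = 0.0
    for t in range(T):
        queried = rng.choice(N, size=M, replace=False)
        p = w[queried] / w[queried].sum()
        follow = rng.choice(queried, p=p)  # follow one queried expert
        total_loss += expert_losses[t, follow]
        est = np.zeros(N)
        est[queried] = expert_losses[t, queried] * (N / M)
        w *= np.exp(-eta * est)            # multiplicative update
    return total_loss
```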