no code implementations • 22 Oct 2021 • Simon Alford, Anshula Gandhi, Akshay Rangamani, Andrzej Banburski, Tony Wang, Sylee Dandekar, John Chin, Tomaso Poggio, Peter Chin
More specifically, we extend existing execution-guided program synthesis approaches with deductive reasoning based on function inverse semantics to enable a neural-guided bidirectional search algorithm.
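A minimal sketch of the idea in Python (the primitives, inverse table, and search loop below are illustrative, not the paper's implementation): deduction runs backward from the target output through known function inverses, execution-guided search runs forward from the inputs, and the two frontiers meet in the middle.

```python
# Hypothetical sketch of bidirectional search with inverse semantics.
def double(x):
    return 2 * x

def double_inverse(y):
    # Inverse semantics: which input would double() need to produce y?
    return y // 2 if y % 2 == 0 else None

PRIMITIVES = [double]
INVERSES = {double: double_inverse}

def bidirectional_search(inputs, target, depth=4):
    forward = set(inputs)   # values reachable by executing primitives forward
    goals = {target}        # subgoals deduced backward from the target output
    for _ in range(depth):
        # Forward (execution-guided) step: grow the reachable set.
        forward |= {f(v) for f in PRIMITIVES for v in forward}
        # Backward (deductive) step: invert each primitive on each subgoal.
        new_goals = {INVERSES[f](g) for f in PRIMITIVES for g in goals}
        goals |= {g for g in new_goals if g is not None}
        meet = forward & goals
        if meet:            # the forward and backward frontiers meet
            return meet
    return None

print(bidirectional_search([3], 24))  # -> {6, 12}
```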
no code implementations • NeurIPS Workshop SVRHM 2021 • Jonathan M Gant, Andrzej Banburski, Arturo Deza
The FTT module was added to a VGG-11 CNN architecture, and ten randomly initialized networks were trained on 20-class subsets of the Places and EcoSet datasets for scene and object classification, respectively.
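A hedged sketch of that setup in Python/PyTorch; `FoveatedTextureTransform` below is a placeholder (the paper's FTT module is not reproduced here), and only the wiring into VGG-11 is illustrated.

```python
# Sketch: prepend a (placeholder) FTT module to VGG-11.
import torch.nn as nn
from torchvision.models import vgg11

class FoveatedTextureTransform(nn.Module):
    """Stand-in for the FTT module; here it is just the identity."""
    def forward(self, x):
        return x

def make_ftt_vgg(num_classes=20):
    base = vgg11(weights=None)
    # Replace the final classifier layer for a 20-class subset.
    base.classifier[6] = nn.Linear(4096, num_classes)
    return nn.Sequential(FoveatedTextureTransform(), base)

# Ten random initializations, as in the abstract:
models = [make_ftt_vgg() for _ in range(10)]
```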
no code implementations • 21 Jul 2021 • Andrzej Banburski, Fernanda De La Torre, Nishka Pant, Ishana Shastri, Tomaso Poggio
Recent theoretical results show that gradient descent on deep neural networks under exponential loss functions locally maximizes classification margin, which is equivalent to minimizing the norm of the weight matrices under margin constraints.
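In symbols, a restatement of that equivalence (notation ours, not quoted from the paper): for a network $f(W; x)$ with labels $y_n \in \{\pm 1\}$,

```latex
\max_{\|W\| = 1} \; \min_n \, y_n f(W; x_n)
\quad \Longleftrightarrow \quad
\min_{W} \|W\| \ \text{ subject to } \ y_n f(W; x_n) \ge 1 \ \ \forall n ,
```

so maximizing the margin under a norm constraint and minimizing the norm under margin constraints pick out the same solutions up to rescaling.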
no code implementations • NeurIPS Workshop LMCA 2020 • Andrzej Banburski, Anshula Gandhi, Simon Alford, Sylee Dandekar, Sang Chin, Tomaso A. Poggio
We argue that this can be achieved by a modular system: one that can adapt to solving different problems by changing only the modules chosen and the order in which those modules are applied to the problem.
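As a toy illustration (ours, not the paper's system), a modular solver can be a list of interchangeable functions whose selection and order vary per problem:

```python
# Toy sketch of a modular problem-solving pipeline (illustrative only):
# the same solver adapts by changing which modules run and in what order.

def simplify(expr):
    return expr.replace("+0", "")        # a tiny "deduction" module

def evaluate(expr):
    return str(eval(expr))               # a tiny "execution" module

def solve(problem, modules):
    state = problem
    for module in modules:               # order of application matters
        state = module(state)
    return state

print(solve("2+3+0", [simplify, evaluate]))  # "5"
print(solve("2+3", [evaluate]))              # "5"
```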
2 code implementations • NeurIPS 2020 • Manish V. Reddy, Andrzej Banburski, Nishka Pant, Tomaso Poggio
A convolutional neural network strongly robust to adversarial perturbations at reasonable computational and performance cost has not yet been demonstrated.
no code implementations • 24 Jun 2020 • Arturo Deza, Qianli Liao, Andrzej Banburski, Tomaso Poggio
For object recognition we find, as expected, that scrambling does not affect the performance of shallow or deep fully connected networks, whereas convolutional networks lose the advantage they otherwise hold.
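A minimal sketch of the scrambling control (our construction, not the paper's code): a single fixed random permutation of pixel positions destroys the local spatial structure that convolutions exploit, while a fully connected network sees an equivalent, merely permuted, feature set.

```python
# Sketch: scramble images with one fixed random pixel permutation.
import numpy as np

rng = np.random.default_rng(0)
perm = rng.permutation(32 * 32)       # fixed permutation shared by all images

def scramble(images):
    """images: array of shape (N, 32, 32, C)."""
    n, h, w, c = images.shape
    flat = images.reshape(n, h * w, c)
    return flat[:, perm, :].reshape(n, h, w, c)
```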
no code implementations • 12 Dec 2019 • Tomaso Poggio, Gil Kur, Andrzej Banburski
In solving a system of $n$ linear equations in $d$ variables $Ax=b$, the condition number of the $n \times d$ matrix $A$ measures how much errors in the data $b$ affect the solution $x$.
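Concretely (standard definitions, notation ours): writing $\sigma_{\max}, \sigma_{\min}$ for the largest and smallest singular values of $A$, and perturbing the data as $A(x + \delta x) = b + \delta b$,

```latex
\kappa(A) \;=\; \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)},
\qquad
\frac{\|\delta x\|}{\|x\|} \;\le\; \kappa(A)\,\frac{\|\delta b\|}{\|b\|} ,
```

so a large $\kappa(A)$ means small errors in $b$ can be greatly amplified in $x$.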
no code implementations • 25 Aug 2019 • Tomaso Poggio, Andrzej Banburski, Qianli Liao
In approximation theory, both shallow and deep networks have been shown to approximate any continuous function on a bounded domain at the expense of an exponential number of parameters (exponential in the dimensionality of the function).
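For instance (a standard approximation-theory bound, not specific to this paper): achieving uniform accuracy $\varepsilon$ for a generic Lipschitz function on $[0,1]^d$ requires

```latex
N \;=\; O\!\left(\varepsilon^{-d}\right)
```

parameters, that is, exponentially many in the dimension $d$.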
no code implementations • 12 Mar 2019 • Andrzej Banburski, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, Tomaso Poggio
In particular, gradient descent induces dynamics on the normalized weights that converge, for $t \to \infty$, to an equilibrium corresponding to a minimum norm (or maximum margin) solution.
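Schematically (our notation, consistent with the abstract rather than quoted from the paper): for layer weights $W_k$,

```latex
\tilde W_k \;=\; \frac{W_k}{\|W_k\|},
\qquad
\tilde W_k(t) \;\longrightarrow\; \tilde W_k^{\infty} \ \text{ as } \ t \to \infty ,
```

where the limit $\tilde W^{\infty}$ satisfies the margin constraints with minimum norm (equivalently, maximum margin).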
3 code implementations • 25 Jul 2018 • Qianli Liao, Brando Miranda, Andrzej Banburski, Jack Hidary, Tomaso Poggio
Given two networks with the same training loss on a dataset, when would they have drastically different test losses and errors?
no code implementations • 29 Jun 2018 • Tomaso Poggio, Qianli Liao, Brando Miranda, Andrzej Banburski, Xavier Boix, Jack Hidary
Here we prove a similar result for nonlinear multilayer DNNs near zero minima of the empirical loss.