Search Results for author: Carlo Baldassi

Found 26 papers, 5 papers with code

Algorithmic Decision Processes

no code implementations • 5 May 2023 • Carlo Baldassi, Fabio Maccheroni, Massimo Marinacci, Marco Pirazzini

We develop a full-fledged analysis of an algorithmic decision process that, in a multialternative choice problem, produces computable choice probabilities and expected decision times.

Typical and atypical solutions in non-convex neural networks with discrete and continuous weights

no code implementations • 26 Apr 2023 • Carlo Baldassi, Enrico M. Malatesta, Gabriele Perugini, Riccardo Zecchina

We analyze the geometry of the landscape of solutions in both models and find important similarities and differences.

Systematically and efficiently improving $k$-means initialization by pairwise-nearest-neighbor smoothing

1 code implementation • 8 Feb 2022 • Carlo Baldassi

We present a meta-method for initializing (seeding) the $k$-means clustering algorithm called PNN-smoothing.

Clustering

Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry

no code implementations • 7 Feb 2022 • Fabrizio Pittorino, Antonio Ferraro, Gabriele Perugini, Christoph Feinauer, Carlo Baldassi, Riccardo Zecchina

This lets us derive a meaningful notion of the flatness of minimizers and of the geodesic paths connecting them.

Quantum Approximate Optimization Algorithm applied to the binary perceptron

no code implementations • 19 Dec 2021 • Pietro Torta, Glen B. Mbeng, Carlo Baldassi, Riccardo Zecchina, Giuseppe E. Santoro

We apply digitized Quantum Annealing (QA) and Quantum Approximate Optimization Algorithm (QAOA) to a paradigmatic task of supervised learning in artificial neural networks: the optimization of synaptic weights for the binary perceptron.

Learning through atypical "phase transitions" in overparameterized neural networks

no code implementations • 1 Oct 2021 • Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, Rosalba Pacelli, Gabriele Perugini, Riccardo Zecchina

Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear.

Unveiling the structure of wide flat minima in neural networks

no code implementations • 2 Jul 2021 • Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, Gabriele Perugini, Riccardo Zecchina

The success of deep learning has revealed the application potential of neural networks across the sciences and opened up fundamental theoretical problems.

Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures

no code implementations • 27 Oct 2020 • Carlo Baldassi, Enrico M. Malatesta, Matteo Negri, Riccardo Zecchina

We analyze the connection between minimizers with good generalizing properties and high local entropy regions of a threshold-linear classifier in Gaussian mixtures with the mean squared error loss function.

Ergodic Annealing

no code implementations • 1 Aug 2020 • Carlo Baldassi, Fabio Maccheroni, Massimo Marinacci, Marco Pirazzini

Simulated Annealing is the crowning glory of Markov Chain Monte Carlo Methods for the solution of NP-hard optimization problems in which the cost function is known.

Reinforcement Learning (RL)

Entropic gradient descent algorithms and wide flat minima

1 code implementation • ICLR 2021 • Fabrizio Pittorino, Carlo Lucibello, Christoph Feinauer, Gabriele Perugini, Carlo Baldassi, Elizaveta Demyanenko, Riccardo Zecchina

The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time.

Multialternative Neural Decision Processes

no code implementations • 3 May 2020 • Carlo Baldassi, Simone Cerreia-Vioglio, Fabio Maccheroni, Massimo Marinacci, Marco Pirazzini

We introduce an algorithmic decision process for multialternative choice that combines binary comparisons and Markovian exploration.

Clustering of solutions in the symmetric binary perceptron

no code implementations • 15 Nov 2019 • Carlo Baldassi, Riccardo Della Vecchia, Carlo Lucibello, Riccardo Zecchina

The geometrical features of the (non-convex) loss landscape of neural network models are crucial in ensuring successful optimization and, most importantly, the capability to generalize well.

Clustering

Natural representation of composite data with replicated autoencoders

no code implementations • 29 Sep 2019 • Matteo Negri, Davide Bergamini, Carlo Baldassi, Riccardo Zecchina, Christoph Feinauer

Generative processes in biology and other fields often produce data that can be regarded as resulting from a composition of basic features.

Properties of the geometry of solutions and capacity of multi-layer neural networks with Rectified Linear Units activations

no code implementations • 17 Jul 2019 • Carlo Baldassi, Enrico M. Malatesta, Riccardo Zecchina

Rectified Linear Units (ReLU) have become the main model for the neural units in current deep learning systems.

Shaping the learning landscape in neural networks around wide flat minima

no code implementations • 20 May 2019 • Carlo Baldassi, Fabrizio Pittorino, Riccardo Zecchina

In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM.

Open-Ended Question Answering

Recombinator-k-means: An evolutionary algorithm that exploits k-means++ for recombination

1 code implementation • 1 May 2019 • Carlo Baldassi

We compare this scheme with a state-of-the-art alternative, a more standard genetic algorithm with deterministic pairwise-nearest-neighbor crossover and an elitist selection policy, of which we also provide an augmented and efficient implementation.

Clustering

On the role of synaptic stochasticity in training low-precision neural networks

no code implementations • 26 Oct 2017 • Carlo Baldassi, Federica Gerace, Hilbert J. Kappen, Carlo Lucibello, Luca Saglietti, Enzo Tartaglione, Riccardo Zecchina

Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes.

Parle: parallelizing stochastic gradient descent

no code implementations • 3 Jul 2017 • Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar, Adam Oberman

We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4x faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters.

Efficiency of quantum versus classical annealing in non-convex learning problems

no code implementations • 26 Jun 2017 • Carlo Baldassi, Riccardo Zecchina

Their energy landscape is dominated by local minima that cause an exponential slowdown of classical thermal annealers, while simulated quantum annealing converges efficiently to rare dense regions of optimal solutions.

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.

A method to reduce the rejection rate in Monte Carlo Markov Chains

3 code implementations • 21 Aug 2016 • Carlo Baldassi

Our code for the Ising case is publicly available [https://github.com/carlobaldassi/RRRMC.jl], and extensible to user-defined models: it provides efficient implementations of standard Metropolis, the RRR method, the BKL method (extended to the case of continuous energy spectra), and the waiting time method [Dall and Sibani Comput. Phys. Commun.

Statistical Mechanics • Disordered Systems and Neural Networks

Unreasonable Effectiveness of Learning Neural Networks: From Accessible States and Robust Ensembles to Basic Algorithmic Schemes

no code implementations • 20 May 2016 • Carlo Baldassi, Christian Borgs, Jennifer Chayes, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina

We define a novel measure, which we call the "robust ensemble" (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions.

Learning may need only a few bits of synaptic precision

no code implementations • 12 Feb 2016 • Carlo Baldassi, Federica Gerace, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina

Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states.

Local entropy as a measure for sampling solutions in Constraint Satisfaction Problems

no code implementations • 18 Nov 2015 • Carlo Baldassi, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina

We introduce a novel Entropy-driven Monte Carlo (EdMC) strategy to efficiently sample solutions of random Constraint Satisfaction Problems (CSPs).

Subdominant Dense Clusters Allow for Simple Learning and High Computational Performance in Neural Networks with Discrete Synapses

no code implementations • 18 Sep 2015 • Carlo Baldassi, Alessandro Ingrosso, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina

We also show that the dense regions are surprisingly accessible by simple learning protocols, and that these synaptic configurations are robust to perturbations and generalize better than typical solutions.

A Max-Sum algorithm for training discrete neural networks

no code implementations • 20 May 2015 • Carlo Baldassi, Alfredo Braunstein

The algorithm we present performs as well as BP on binary perceptron learning problems, and may be better suited to address the problem on fully-connected two-layer networks, since inherent symmetries in two-layer networks are naturally broken using the MS approach.
