Carlo Lucibello
PhD · Sapienza University of Rome, Department of Physics
About
50 Publications · 3,803 Reads
833 Citations
Additional affiliations
November 2011 - present
Publications (50)
Generative diffusion processes are state-of-the-art machine learning models deeply connected with fundamental concepts in statistical physics. Depending on the dataset size and the capacity of the network, their behavior is known to transition from an associative memory regime to a generalization phase in a phenomenon that has been described as a g...
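As a minimal illustration of the generative diffusion mechanics referenced here (not the paper's model: the 1D mixture dataset, the noise schedule, and the exact-score shortcut standing in for a trained network are all assumptions of this sketch):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: a 1D two-component Gaussian mixture (an assumed stand-in dataset).
    means, var0, w = np.array([-2.0, 2.0]), 0.1, np.array([0.5, 0.5])

    # Variance-preserving forward process: x_t = sqrt(abar_t) x_0 + sqrt(1-abar_t) * noise.
    T = 500
    betas = np.linspace(1e-4, 0.02, T)
    abar = np.cumprod(1.0 - betas)

    def score(x, t):
        # Exact score of the noised mixture marginal (available here only because
        # the toy data is a Gaussian mixture; a trained network would replace this).
        m = np.sqrt(abar[t]) * means              # noised component means
        v = abar[t] * var0 + (1.0 - abar[t])      # noised component variance
        logp = -(x[:, None] - m) ** 2 / (2 * v) + np.log(w)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)         # responsibilities per component
        return (r @ m - x) / v

    # Reverse-time ancestral sampling, starting from the standard normal prior.
    x = rng.standard_normal(2000)
    for t in range(T - 1, -1, -1):
        z = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = (x + betas[t] * score(x, t)) / np.sqrt(1.0 - betas[t]) + np.sqrt(betas[t]) * z

    print("fraction of samples near the +2 mode:", np.mean(x > 0))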
Noiseless compressive sensing is a two-step setting that allows for undersampling a sparse signal and then reconstructing it without loss of information. The LASSO algorithm, based on $\ell_1$ regularization, provides an efficient and robust way to address this problem, but it fails in the regime of very high compression rate. Here we present two algor...
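For reference, the LASSO baseline mentioned here can be minimized with plain iterative soft-thresholding (ISTA); a minimal numpy sketch, with sizes, λ, and iteration count assumed (this is the generic baseline, not the paper's two algorithms, and the helper name is ours):

    import numpy as np

    def ista(A, y, lam, n_iter=500):
        # Minimize 0.5*||y - A x||^2 + lam*||x||_1 by iterative soft-thresholding.
        L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            g = x + A.T @ (y - A @ x) / L          # gradient step on the smooth part
            x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
        return x

    rng = np.random.default_rng(0)
    n, m, k = 200, 80, 10                          # signal size, measurements, sparsity
    x0 = np.zeros(n)
    x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # random sensing matrix
    x_hat = ista(A, A @ x0, lam=0.01)
    print("reconstruction error:", np.linalg.norm(x_hat - x0))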
It has been recently shown that a learning transition happens when a Hopfield Network stores examples generated as superpositions of random features, where new attractors corresponding to such features appear in the model. In this work we reveal that the network also develops attractors corresponding to previously unseen examples generated with the...
We present InvMSAFold, a method for generating a diverse set of protein sequences that fold into a single structure. For a given structure, InvMSAFold defines a probability distribution over the space of sequences, capturing the amino acid covariances observed in Multiple Sequence Alignments (MSA) of homologous proteins. This allows for the generat...
Recent generalizations of the Hopfield model of associative memories are able to store a number P of random patterns that grows exponentially with the number N of neurons, P=exp(αN). Besides the huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism which is part of the Transformer archit...
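The retrieval rule behind these exponential-capacity ("modern") Hopfield networks is a softmax over pattern overlaps, formally close to an attention head. A minimal sketch, with the sizes and the inverse temperature β below assumed for the toy example:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    N, P, beta = 64, 200, 0.5                    # neurons, stored patterns, inverse temperature
    X = rng.choice([-1.0, 1.0], size=(P, N))     # rows are stored binary patterns

    q = X[0].copy()
    flip = rng.choice(N, size=10, replace=False)
    q[flip] *= -1                                # corrupt 10 of the 64 bits of pattern 0

    # One step of the attention-like update: q <- X^T softmax(beta * X q)
    q_new = softmax(beta * (X @ q)) @ X
    print("overlap with pattern 0:", np.sign(q_new) @ X[0] / N)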
The Hopfield model is a paradigmatic model of neural networks that has been analyzed for many decades in the statistical physics, neuroscience, and machine learning communities. Inspired by the manifold hypothesis in machine learning, we propose and investigate a generalization of the standard setting that we name random-features Hopfield model. He...
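A minimal sketch of the random-features construction as described: examples are signs of sparse superpositions of random features and are stored with a Hebbian rule; whether a feature (never stored directly) becomes an attractor depends on the load, which is the transition the paper studies. All sizes below are assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    N, D, P, c = 500, 10, 400, 3                 # neurons, features, examples, features per example

    F = rng.choice([-1.0, 1.0], size=(D, N))     # random binary features
    # Each example is the sign of a sparse superposition of c random features.
    idx = np.array([rng.choice(D, size=c, replace=False) for _ in range(P)])
    coef = rng.standard_normal((P, c))
    X = np.sign((coef[:, :, None] * F[idx]).sum(axis=1))

    J = X.T @ X / N                              # Hebbian couplings built from the examples
    np.fill_diagonal(J, 0.0)

    # Check whether a feature behaves as a fixed point of zero-temperature dynamics.
    s = F[0].copy()
    for _ in range(20):                          # synchronous sign updates
        s = np.sign(J @ s)
    print("overlap of fixed point with feature 0:", s @ F[0] / N)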
Empirical studies on the landscape of neural networks have shown that low-energy configurations are often found in complex connected structures, where zero-energy paths between pairs of distant solutions can be constructed. Here, we consider the spherical negative perceptron, a prototypical nonconvex neural network model framed as a continuous cons...
Noiseless compressive sensing is a protocol that enables undersampling and later recovery of a signal without loss of information. This compression is possible because the signal is usually sufficiently sparse in a given basis. Currently, the algorithm offering the best tradeoff between compression rate, robustness, and speed for compressive sensin...
Artificial networks have been studied through the prism of statistical mechanics as disordered systems since the 80s, starting from the simple models of Hopfield's associative memory and the single-neuron perceptron classifier. Assuming data is generated by a teacher model, asymptotic generalisation predictions were originally derived using the rep...
The Hopfield model has a long-standing tradition in statistical physics, being one of the few neural networks for which a theory is available. Extending the theory of Hopfield models for correlated data could help understand the success of deep neural networks, for instance describing how they extract features from data. Motivated by this, we propo...
Message-passing algorithms based on the Belief Propagation (BP) equations constitute a well-known distributed computational scheme. They yield exact marginals on tree-like graphical models and have also proven to be effective in many problems defined on loopy graphs, from inference to optimization, from signal processing to clustering. The BP-based s...
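As a minimal runnable illustration of sum-product BP being exact on a tree (here the simplest tree, a chain; the sizes, couplings, and helper name below are assumptions of the sketch), with a brute-force check:

    import numpy as np
    from itertools import product

    # Ising chain (a tree, so BP is exact): E(s) = -sum_k J[k] s_k s_{k+1} - sum_k h[k] s_k.
    rng = np.random.default_rng(0)
    n = 6
    J, h, beta = rng.standard_normal(n - 1), rng.standard_normal(n), 1.0

    def bp_marginal(i):
        # Sum-product messages swept in from both ends of the chain toward site i.
        right = np.ones(2)                       # message over s_k in (-1, +1)
        for k in range(i):
            right = np.array([sum(right[a] * np.exp(beta * (h[k] * sa + J[k] * sa * sb))
                                  for a, sa in enumerate((-1, 1))) for sb in (-1, 1)])
        left = np.ones(2)
        for k in range(n - 1, i, -1):
            left = np.array([sum(left[b] * np.exp(beta * (h[k] * sb + J[k - 1] * sa * sb))
                                 for b, sb in enumerate((-1, 1))) for sa in (-1, 1)])
        belief = right * left * np.exp(beta * h[i] * np.array([-1.0, 1.0]))
        return belief / belief.sum()

    # Brute-force check of the marginal at the middle site.
    i = n // 2
    weights = np.zeros(2)
    for s in product((-1, 1), repeat=n):
        E = -sum(J[k] * s[k] * s[k + 1] for k in range(n - 1)) - sum(h[k] * s[k] for k in range(n))
        weights[(s[i] + 1) // 2] += np.exp(-beta * E)

    print("BP marginal:   ", bp_marginal(i))
    print("exact marginal:", weights / weights.sum())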
Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design, and the prediction of structural properties. Neural network (NN) architectures have shown great performance, commonly attributed to the capacity to extract non-trivial higher-orde...
The spin-glass transition in a field in finite dimension is analyzed directly at zero temperature using a perturbative loop expansion around the Bethe lattice solution. The loop expansion is generated by the M-layer construction whose first diagrams are evaluated numerically and analytically. The generalized Ginzburg criterion reveals that the uppe...
Pairwise models like the Ising model or the generalized Potts model have found many successful applications in fields like physics, biology, and economics. Closely connected is the problem of inverse statistical mechanics, where the goal is to infer the parameters of such models given observed data. An open problem in this field is the question of...
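A standard baseline for this inverse problem is pseudolikelihood maximization. The sketch below (a tiny system so samples can be drawn exactly; sizes and learning rate are assumptions, and this is a generic baseline, not the paper's proposal) recovers Ising couplings from data:

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(0)
    n = 5
    J_true = rng.standard_normal((n, n)) * 0.5
    J_true = np.triu(J_true, 1)
    J_true = J_true + J_true.T                   # symmetric couplings, zero diagonal

    # Exact sampling from p(s) ~ exp(0.5 s^T J s) by enumerating all 2^n states.
    states = np.array(list(product((-1.0, 1.0), repeat=n)))
    logw = 0.5 * np.einsum('ki,ij,kj->k', states, J_true, states)
    p = np.exp(logw - logw.max())
    p /= p.sum()
    S = states[rng.choice(len(states), size=5000, p=p)]

    # Pseudolikelihood ascent: site-wise logistic regressions; the gradient of the
    # mean conditional log-likelihood wrt J[i, j] is < s_j * (s_i - tanh(h_i)) >.
    J = np.zeros((n, n))
    for _ in range(3000):
        H = S @ J.T                              # local fields h_i per sample (diag of J is 0)
        G = (S - np.tanh(H)).T @ S / len(S)
        G = (G + G.T) / 2                        # enforce symmetry of the estimate
        np.fill_diagonal(G, 0.0)
        J += 0.2 * G
    print("max abs coupling error:", np.abs(J - J_true).max())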
The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities with respect to sharp ones. In this work we first discuss the relationship between alternative measures of flatness: the local entropy, which is useful for an...
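Flatness can be made concrete with a simple proxy (distinct from the paper's local entropy, which is defined through a replicated measure): the average loss increase under Gaussian weight perturbations. A minimal sketch, with the helper name and toy loss functions assumed for illustration:

    import numpy as np

    def flatness_proxy(loss_fn, w, sigma=0.05, n_samples=100, rng=None):
        # Average increase of loss_fn around w under N(0, sigma^2) weight noise;
        # smaller values indicate a flatter minimum at this perturbation scale.
        rng = rng or np.random.default_rng(0)
        base = loss_fn(w)
        deltas = [loss_fn(w + sigma * rng.standard_normal(w.shape)) - base
                  for _ in range(n_samples)]
        return np.mean(deltas)

    # Toy usage: a sharp and a flat quadratic valley in 2D.
    sharp = lambda w: 50.0 * w[0] ** 2 + 50.0 * w[1] ** 2
    flat = lambda w: 0.5 * w[0] ** 2 + 0.5 * w[1] ** 2
    w0 = np.zeros(2)
    print("sharp:", flatness_proxy(sharp, w0), " flat:", flatness_proxy(flat, w0))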
The spin-glass transition in a field in finite dimension is analyzed directly at zero temperature using a perturbative loop expansion around the Bethe lattice solution. The loop expansion is generated by the $M$-layer construction whose first diagrams are evaluated numerically and analytically. The Ginzburg criterion, from both the paramagnetic and...
In generalized linear estimation (GLE) problems, we seek to estimate a signal that is observed through a linear transform followed by a component-wise, possibly nonlinear and noisy, channel. In the Bayesian optimal setting, generalized approximate message passing (GAMP) is known to achieve optimal performance for GLE. However, its performance can s...
The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities with respect to sharp ones. First, we discuss Gaussian mixture classification models and show analytically that there exist Bayes optimal pointwise estimators...
Significance
The ϵ expansion around the upper critical dimension is a standard tool for studying critical phenomena of models defined on finite-dimensional lattices. However, it faces problems in describing strongly disordered models. Here we use a loop expansion around the Bethe solution, an advanced mean-field theory, since it provides a complete...
The geometrical features of the (non-convex) loss landscape of neural network models are crucial in ensuring successful optimization and, most importantly, the capability to generalize well. While minimizers' flatness consistently correlates with good generalization, there has been little rigorous work exploring the conditions for the existence of suc...
We apply the perturbative loop expansion around the Bethe solution to the Random Field Ising Model at zero temperature (T = 0). A comparison with the standard epsilon-expansion is made, highlighting the key differences that make the new expansion much more appropriate to correctly describe strongly disordered systems, especially those controlled by...
The training of stochastic neural network models with binary ($\pm1$) weights and activations via a deterministic and continuous surrogate network is investigated. We derive, using mean field theory, a set of scalar equations describing how input signals propagate through the surrogate network. The equations reveal that these continuous models exhi...
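A sketch of this style of mean-field signal-propagation analysis: for sign activations with jointly Gaussian pre-activations, the output correlation of two inputs follows the arcsine map, a standard Gaussian identity. This is illustrative of the approach, not the paper's exact scalar equations; the width and input correlation below are assumptions:

    import numpy as np

    # If two inputs have pre-activation correlation c, their sign-outputs have
    # correlation (2/pi)*arcsin(c). Iterating the map mimics depth-wise propagation.
    def correlation_map(c):
        return (2.0 / np.pi) * np.arcsin(c)

    c = 0.95
    for layer in range(1, 8):
        c = correlation_map(c)
        print(f"layer {layer}: predicted correlation = {c:.4f}")

    # Monte Carlo check with one random sign layer (Gaussian weights, width 1000).
    rng = np.random.default_rng(0)
    N = 1000
    x = rng.standard_normal(N)
    y = 0.95 * x + np.sqrt(1 - 0.95 ** 2) * rng.standard_normal(N)
    W = rng.standard_normal((N, N)) / np.sqrt(N)
    print("one-layer MC estimate:", np.mean(np.sign(W @ x) * np.sign(W @ y)))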
We consider two formulations of the random-link fractional matching problem, a relaxed version of the more standard random-link (integer) matching problem. In one formulation, we allow each node to be linked to itself in the optimal matching configuration. In the other one, on the contrary, such a link is forbidden. Both problems have the same asym...
Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties suc...
For every physical model defined on a generic graph or factor graph, the Bethe $M$-layer construction allows building a different model for which the Bethe approximation is exact in the large $M$ limit and which coincides with the original model for $M=1$. The $1/M$ perturbative series is then expressed by a diagrammatic loop expansion in terms of so-...
The matching problem is a notorious combinatorial optimization problem that has for many years attracted the attention of the statistical physics community. Here we analyze the Euclidean version of the problem, i.e. the optimal matching problem between points randomly distributed on a $d$-dimensional Euclidean space, where the cost to minimize depe...
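The bipartite (assignment) variant is easy to probe numerically with an exact Hungarian solver; a minimal Monte Carlo sketch, with the quadratic cost and the sizes assumed for illustration:

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from scipy.spatial.distance import cdist

    rng = np.random.default_rng(0)
    d, N, trials = 2, 100, 20
    costs = []
    for _ in range(trials):
        red, blue = rng.random((N, d)), rng.random((N, d))  # points in the unit square
        C = cdist(red, blue) ** 2                # quadratic cost, an assumed choice
        rows, cols = linear_sum_assignment(C)    # exact optimal bipartite matching
        costs.append(C[rows, cols].sum())
    print(f"mean optimal matching cost (d={d}, N={N}):", np.mean(costs))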
Significance
Artificial neural networks are some of the most widely used tools in data science. Learning is, in principle, a hard problem in these systems, but in practice heuristic algorithms often find solutions with good generalization properties. We propose an explanation of this good performance in terms of a nonequilibrium statistical physics...
We introduce a novel Entropy-driven Monte Carlo (EdMC) strategy to efficiently sample solutions of random Constraint Satisfaction Problems (CSPs). First, we extend a recent result that, using a large-deviation analysis, shows that the geometry of the space of solutions of the Binary Perceptron Learning Problem (a prototypical CSP) contains regions...
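EdMC itself samples from a local-entropy-tilted measure; as a plain baseline on the underlying CSP, the sketch below runs simulated annealing on a binary perceptron instance (N, α, and the annealing schedule are assumptions, and success of the anneal is not guaranteed):

    import numpy as np

    rng = np.random.default_rng(0)
    N, alpha = 201, 0.3
    P = int(alpha * N)
    X = rng.choice([-1.0, 1.0], size=(P, N))     # random input patterns
    y = rng.choice([-1.0, 1.0], size=P)          # random target labels

    w = rng.choice([-1.0, 1.0], size=N)          # binary synaptic weights
    h = X @ w                                    # pre-activations, updated incrementally
    E = np.sum(y * h <= 0)                       # energy = number of violated patterns
    for step in range(200_000):
        beta = 0.5 + step * 2e-5                 # slowly increasing inverse temperature
        i = rng.integers(N)
        h_new = h - 2.0 * w[i] * X[:, i]         # fields after flipping weight i
        E_new = np.sum(y * h_new <= 0)
        if E_new <= E or rng.random() < np.exp(-beta * (E_new - E)):
            w[i], h, E = -w[i], h_new, E_new     # accept the single-spin flip
        if E == 0:
            break
    print("violated patterns at the end:", E)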
Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states. The choice of discrete synapses is motivated by biological reasoning and experiments, and possibly by hardware implementation considerations as well. In this paper we extend a previous large deviations analysis which unveiled the exi...
We show that discrete synaptic weights can be efficiently used for learning in large scale neural systems, and lead to unanticipated computational performance. We focus on the representative case of learning random patterns with binary synapses in single layer networks. The standard statistical analysis shows that this problem is exponentially domi...
This PhD thesis has the following structure: Chapter 1 - General introduction; Chapter 2 - Preliminaries; Chapter 3 - The Replicated Transfer Matrix; Chapter 4 - Finite Size Corrections On Random Graphs; Chapter 5 - The Random Field Ising Model; Chapter 6 - The Euclidean Assignment Problem; Chapter 7 - The Euclidean Matching Problem; Chapter 8 - Th...
We propose a simple yet very predictive form, based on a Poisson equation, for the functional dependence of the cost on the density of points in the Euclidean bipartite matching problem. This leads, for quadratic costs, to the analytic prediction of the large $N$ limit of the average cost in dimensions $d=1,2$ and of the subleading correction in...
The presence of a random magnetic field in ferromagnetic systems leads, in the broken phase, to an anomalous $O(\sqrt{1/N})$ convergence of some thermodynamic quantities to their asymptotic limits. Here we show a general method, based on the replica trick, to compute analytically the $O(\sqrt{1/N})$ finite size correction to the average free energy...
We derive the analytical expression for the first finite size correction to the average free energy of disordered Ising models on random regular graphs. The formula can be physically interpreted as a weighted sum over all non-self-intersecting loops in the graph, the weight being the free-energy shift due to the addition of the loop to an infinite...
We analyse the asymptotic behaviour of random instances of the Maximum Set Packing (MSP) optimization problem, also known as Maximum Matching or Maximum Strong Independent Set on Hypergraphs. We give an analytical prediction of the MSP size using the 1RSB cavity method from statistical mechanics of disordered systems. We also propose a heuristic a...
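For orientation, a minimal greedy baseline for set packing (the helper name and the random instance sizes are ours, and this is not the 1RSB-derived heuristic of the paper):

    import random

    def greedy_set_packing(sets):
        # Greedy maximum set packing: scan sets (smallest first) and keep each
        # one that is disjoint from everything selected so far.
        chosen, used = [], set()
        for s in sorted(sets, key=len):
            if used.isdisjoint(s):
                chosen.append(s)
                used |= s
        return chosen

    # Random hypergraph instance: 200 subsets of size 3 from 100 ground elements.
    random.seed(0)
    instance = [frozenset(random.sample(range(100), 3)) for _ in range(200)]
    packing = greedy_set_packing(instance)
    print("packing size:", len(packing))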
Using a formalism based on the spectral decomposition of the replicated transfer matrix for disordered Ising models, we obtain several results that apply both to isolated one-dimensional systems and to locally tree-like graph and factor graph (p-spin) ensembles. We present exact analytical expressions, which can be efficiently approximated numerica...
We study the finite-size corrections to the free-energy density in disordered spin systems on sparse random graphs, using both replica theory and the cavity method. We derive analytical expressions for the O(1/N) corrections in the replica symmetric phase as a linear combination of the free energies of open and closed chains. We perform a numerical...