Carlo Lucibello

Sapienza University of Rome · Department of Physics

PhD

About

50 Publications · 3,803 Reads
833 Citations
Additional affiliations
November 2011 - present: Sapienza University of Rome, PhD Student

Publications (50)
Preprint
Full-text available
Generative diffusion processes are state-of-the-art machine learning models deeply connected with fundamental concepts in statistical physics. Depending on the dataset size and the capacity of the network, their behavior is known to transition from an associative memory regime to a generalization phase in a phenomenon that has been described as a g...
Preprint
Noiseless compressive sensing is a two-step setting that allows for undersampling a sparse signal and then reconstructing it without loss of information. The LASSO algorithm, based on $\ell_1$ regularization, provides an efficient and robust way to address this problem, but it fails in the regime of very high compression rate. Here we present two algor...
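As a reference for the LASSO baseline mentioned in the abstract, below is a minimal sketch of $\ell_1$ reconstruction via iterative soft-thresholding (ISTA). The sensing matrix, sparsity level, and regularization strength are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of noiseless compressive sensing via LASSO/ISTA.
# All sizes and parameters are illustrative, not from the paper.
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam=0.01, n_iter=500):
    """Iterative soft-thresholding for min_x 0.5*||y - Ax||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + A.T @ (y - A @ x) / L, lam / L)
    return x

rng = np.random.default_rng(0)
n, m, k = 200, 80, 10                       # signal size, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)    # random Gaussian sensing matrix
y = A @ x_true                              # noiseless measurements
x_hat = ista(A, y)
print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```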
Preprint
It has been recently shown that a learning transition happens when a Hopfield Network stores examples generated as superpositions of random features, where new attractors corresponding to such features appear in the model. In this work we reveal that the network also develops attractors corresponding to previously unseen examples generated with the...
Preprint
Full-text available
We present InvMSAFold, a method for generating a diverse set of protein sequences that fold into a single structure. For a given structure, InvMSAFold defines a probability distribution over the space of sequences, capturing the amino acid covariances observed in Multiple Sequence Alignments (MSA) of homologous proteins. This allows for the generat...
Article
Recent generalizations of the Hopfield model of associative memories are able to store a number P of random patterns that grows exponentially with the number N of neurons, P=exp(αN). Besides the huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism which is part of the Transformer archit...
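The retrieval update in these exponential-capacity Hopfield networks can be written as softmax attention over the stored patterns, which is the connection to Transformers the abstract refers to. A minimal sketch, with illustrative sizes and inverse temperature beta:

```python
# Hedged sketch of retrieval in a modern (exponential-capacity) Hopfield
# network: the update coincides with softmax attention over stored patterns.
# Sizes, noise level and beta are illustrative assumptions.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def retrieve(X, query, beta=8.0, n_steps=3):
    """One pattern per row of X; iterate xi <- X^T softmax(beta * X xi)."""
    xi = query.copy()
    for _ in range(n_steps):
        xi = X.T @ softmax(beta * X @ xi)
    return xi

rng = np.random.default_rng(1)
N, P = 64, 32                               # neurons, stored patterns
X = rng.choice([-1.0, 1.0], size=(P, N)) / np.sqrt(N)   # unit-norm patterns
query = X[0] + 0.1 * rng.normal(size=N)    # noisy version of pattern 0
out = retrieve(X, query)
print("cosine overlap with pattern 0:",
      X[0] @ out / (np.linalg.norm(X[0]) * np.linalg.norm(out)))
```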
Article
The Hopfield model is a paradigmatic model of neural networks that has been analyzed for many decades in the statistical physics, neuroscience, and machine learning communities. Inspired by the manifold hypothesis in machine learning, we propose and investigate a generalization of the standard setting that we name random-features Hopfield model. He...
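As a reference point for the generalization proposed in this paper, here is a minimal sketch of the standard Hopfield setting: Hebbian storage of random binary patterns and retrieval by sign updates. Sizes and corruption level are illustrative assumptions.

```python
# Minimal sketch of the standard Hopfield model: Hebbian couplings and
# asynchronous-style sign updates. Sizes are illustrative (P << N: retrieval phase).
import numpy as np

rng = np.random.default_rng(2)
N, P = 200, 10                              # neurons, patterns
patterns = rng.choice([-1, 1], size=(P, N))
J = (patterns.T @ patterns) / N             # Hebbian couplings
np.fill_diagonal(J, 0.0)                    # no self-coupling

# Start from a corrupted copy of pattern 0 and relax with sign updates.
s = patterns[0] * np.where(rng.random(N) < 0.1, -1, 1)
for _ in range(20):
    s = np.sign(J @ s)
print("overlap with pattern 0:", patterns[0] @ s / N)
```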
Article
Empirical studies on the landscape of neural networks have shown that low-energy configurations are often found in complex connected structures, where zero-energy paths between pairs of distant solutions can be constructed. Here, we consider the spherical negative perceptron, a prototypical nonconvex neural network model framed as a continuous cons...
Preprint
Empirical studies on the landscape of neural networks have shown that low-energy configurations are often found in complex connected structures, where zero-energy paths between pairs of distant solutions can be constructed. Here we consider the spherical negative perceptron, a prototypical non-convex neural network model framed as a continuous cons...
Preprint
Recent generalizations of the Hopfield model of associative memories are able to store a number $P$ of random patterns that grows exponentially with the number $N$ of neurons, $P=\exp(\alpha N)$. Besides the huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism which is part of the Trans...
Preprint
Full-text available
Noiseless compressive sensing is a protocol that enables undersampling and later recovery of a signal without loss of information. This compression is possible because the signal is usually sufficiently sparse in a given basis. Currently, the algorithm offering the best tradeoff between compression rate, robustness, and speed for compressive sensin...
Preprint
Artificial networks have been studied through the prism of statistical mechanics as disordered systems since the 80s, starting from the simple models of Hopfield's associative memory and the single-neuron perceptron classifier. Assuming data is generated by a teacher model, asymptotic generalisation predictions were originally derived using the rep...
Preprint
The Hopfield model has a long-standing tradition in statistical physics, being one of the few neural networks for which a theory is available. Extending the theory of Hopfield models for correlated data could help understand the success of deep neural networks, for instance describing how they extract features from data. Motivated by this, we propo...
Article
Full-text available
Message-passing algorithms based on the Belief Propagation (BP) equations constitute a well-known distributed computational scheme. They yield exact marginals on tree-like graphical models and have also proven to be effective in many problems defined on loopy graphs, from inference to optimization, from signal processing to clustering. The BP-based s...
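A minimal sketch of the sum-product BP equations on a small pairwise Ising tree, where the resulting marginals are exact; the graph, couplings, and fields below are illustrative, not from the paper.

```python
# Hedged sketch of sum-product Belief Propagation on a pairwise Ising tree.
import numpy as np

edges = [(0, 1), (1, 2), (1, 3)]            # a small tree
J = {e: 0.5 for e in edges}                 # couplings
h = np.array([0.2, -0.1, 0.0, 0.3])         # local fields
spins = np.array([-1, 1])

neighbors = {i: [] for i in range(4)}
for i, j in edges:
    neighbors[i].append(j)
    neighbors[j].append(i)

# m[(i, j)] is the cavity message from i to j, a distribution over spin j.
m = {}
for i, j in edges:
    m[(i, j)] = np.ones(2) / 2
    m[(j, i)] = np.ones(2) / 2

for _ in range(20):                         # plenty of sweeps to converge on a tree
    for (i, j) in list(m):
        Jij = J.get((i, j), J.get((j, i)))
        incoming = np.ones(2)               # product of messages into i, except from j
        for k in neighbors[i]:
            if k != j:
                incoming = incoming * m[(k, i)]
        # Sum over s_i of exp(h_i s_i + J_ij s_i s_j) times incoming messages.
        new = np.array([
            sum(np.exp(h[i] * si + Jij * si * sj) * inc
                for si, inc in zip(spins, incoming))
            for sj in spins
        ])
        m[(i, j)] = new / new.sum()

# Beliefs: local field times product of all incoming messages (exact on trees).
for i in range(4):
    b = np.exp(h[i] * spins)
    for k in neighbors[i]:
        b = b * m[(k, i)]
    b = b / b.sum()
    print(f"P(s_{i} = +1) = {b[1]:.3f}")
```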
Article
Full-text available
Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design and the prediction of structural properties. Neural network (NN) architectures have shown great performance, commonly attributed to the capacity to extract non-trivial higher-orde...
Article
The spin-glass transition in a field in finite dimension is analyzed directly at zero temperature using a perturbative loop expansion around the Bethe lattice solution. The loop expansion is generated by the M-layer construction whose first diagrams are evaluated numerically and analytically. The generalized Ginzburg criterion reveals that the uppe...
Article
Pairwise models like the Ising model or the generalized Potts model have found many successful applications in fields like physics, biology, and economics. Closely connected is the problem of inverse statistical mechanics, where the goal is to infer the parameters of such models given observed data. An open problem in this field is the question of...
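One standard approach to the inverse problem described here is pseudo-likelihood maximization, in which each spin's conditional distribution is fit as a logistic regression on the remaining spins. A hedged sketch with illustrative data generation and sizes (not from the paper):

```python
# Hedged sketch of inverse Ising inference via pseudo-likelihood maximization.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
N, M = 10, 5000                             # spins, samples

# Ground-truth symmetric couplings and single-spin Gibbs sampling.
J_true = rng.normal(scale=0.3, size=(N, N))
J_true = (J_true + J_true.T) / 2
np.fill_diagonal(J_true, 0.0)

s = rng.choice([-1, 1], size=N).astype(float)
samples = np.empty((M, N))
for t in range(M * 5):
    i = rng.integers(N)
    p_up = 1.0 / (1.0 + np.exp(-2.0 * J_true[i] @ s))   # P(s_i = +1 | rest)
    s[i] = 1.0 if rng.random() < p_up else -1.0
    if t % 5 == 0:
        samples[t // 5] = s

# Pseudo-likelihood: P(s_i | s_{-i}) is a logistic regression for each i.
J_hat = np.zeros((N, N))
for i in range(N):
    X = np.delete(samples, i, axis=1)
    clf = LogisticRegression(C=10.0, fit_intercept=False).fit(X, samples[:, i])
    J_hat[i, np.arange(N) != i] = clf.coef_[0] / 2.0    # sigmoid(2 h) -> factor 1/2
J_hat = (J_hat + J_hat.T) / 2.0
print("max coupling reconstruction error:", np.abs(J_hat - J_true).max())
```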
Article
The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities than sharp ones. In this work we first discuss the relationship between alternative measures of flatness: the local entropy, which is useful for an...
Preprint
Full-text available
Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design and the prediction of structural properties. Neural network (NN) architectures have shown great performance, commonly attributed to the capacity to extract non-trivial h...
Preprint
Full-text available
The spin-glass transition in a field in finite dimension is analyzed directly at zero temperature using a perturbative loop expansion around the Bethe lattice solution. The loop expansion is generated by the $M$-layer construction whose first diagrams are evaluated numerically and analytically. The Ginzburg criterion, from both the paramagnetic and...
Article
In generalized linear estimation (GLE) problems, we seek to estimate a signal that is observed through a linear transform followed by a component-wise, possibly nonlinear and noisy, channel. In the Bayesian optimal setting, generalized approximate message passing (GAMP) is known to achieve optimal performance for GLE. However, its performance can s...
Preprint
Pairwise models like the Ising model or the generalized Potts model have found many successful applications in fields like physics, biology, and economics. Closely connected is the problem of inverse statistical mechanics, where the goal is to infer the parameters of such models given observed data. An open problem in this field is the question of...
Preprint
The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities than sharp ones. First, we discuss Gaussian mixture classification models and show analytically that there exist Bayes optimal pointwise estimators...
Article
Significance: The ϵ expansion around the upper critical dimension is a standard tool for studying critical phenomena of models defined on finite-dimensional lattices. However, it faces problems in describing strongly disordered models. Here we use a loop expansion around the Bethe solution, an advanced mean-field theory, since it provides a complete...
Preprint
The geometrical features of the (non-convex) loss landscape of neural network models are crucial in ensuring successful optimization and, most importantly, the capability to generalize well. While minimizers' flatness consistently correlates with good generalization, there has been little rigorous work in exploring the condition of existence of suc...
Preprint
We apply the perturbative loop expansion around the Bethe solution to the Random Field Ising Model at zero temperature ($T=0$). A comparison with the standard epsilon-expansion is made, highlighting the key differences that make the new expansion much more appropriate to correctly describe strongly disordered systems, especially those controlled by...
Preprint
In Generalized Linear Estimation (GLE) problems, we seek to estimate a signal that is observed through a linear transform followed by a component-wise, possibly nonlinear and noisy, channel. In the Bayesian optimal setting, Generalized Approximate Message Passing (GAMP) is known to achieve optimal performance for GLE. However, its performance can s...
Preprint
The training of stochastic neural network models with binary ($\pm1$) weights and activations via a deterministic and continuous surrogate network is investigated. We derive, using mean field theory, a set of scalar equations describing how input signals propagate through the surrogate network. The equations reveal that these continuous models exhi...
Article
Full-text available
We consider two formulations of the random-link fractional matching problem, a relaxed version of the more standard random-link (integer) matching problem. In one formulation, we allow each node to be linked to itself in the optimal matching configuration. In the other one, on the contrary, such a link is forbidden. Both problems have the same asym...
Article
Full-text available
Stochasticity and limited precision of synaptic weights in neural network models is a key aspect of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties suc...
Article
Full-text available
For every physical model defined on a generic graph or factor graph, the Bethe $M$-layer construction allows one to build a different model for which the Bethe approximation is exact in the large $M$ limit and which coincides with the original model for $M=1$. The $1/M$ perturbative series is then expressed by a diagrammatic loop expansion in terms of so-...
Article
The matching problem is a notorious combinatorial optimization problem that has attracted the attention of the statistical physics community for many years. Here we analyze the Euclidean version of the problem, i.e. the optimal matching problem between points randomly distributed on a $d$-dimensional Euclidean space, where the cost to minimize depe...
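A single random instance of the Euclidean bipartite version can be solved exactly with the Hungarian algorithm; a minimal sketch with illustrative size, dimension, and quadratic cost:

```python
# Hedged sketch: one random instance of Euclidean bipartite matching,
# solved exactly via scipy's Hungarian algorithm. Sizes and cost are illustrative.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(4)
N, d = 500, 2
red = rng.random((N, d))                    # two sets of random points in [0,1]^d
blue = rng.random((N, d))
cost = cdist(red, blue, "sqeuclidean")      # quadratic cost c = |x - y|^2
rows, cols = linear_sum_assignment(cost)    # optimal matching
print("average optimal cost:", cost[rows, cols].mean())
```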
Article
Full-text available
Significance: Artificial neural networks are some of the most widely used tools in data science. Learning is, in principle, a hard problem in these systems, but in practice heuristic algorithms often find solutions with good generalization properties. We propose an explanation of this good performance in terms of a nonequilibrium statistical physics...
Article
Full-text available
We introduce a novel Entropy-driven Monte Carlo (EdMC) strategy to efficiently sample solutions of random Constraint Satisfaction Problems (CSPs). First, we extend a recent result that, using a large-deviation analysis, shows that the geometry of the space of solutions of the Binary Perceptron Learning Problem (a prototypical CSP), contains regions...
Article
Full-text available
Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states. The choice of discrete synapses is motivated by biological reasoning and experiments, and possibly by hardware implementation considerations as well. In this paper we extend a previous large deviations analysis which unveiled the exi...
Article
Full-text available
We show that discrete synaptic weights can be efficiently used for learning in large scale neural systems, and lead to unanticipated computational performance. We focus on the representative case of learning random patterns with binary synapses in single layer networks. The standard statistical analysis shows that this problem is exponentially domi...
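For very small systems, the solution space of the binary-synapse storage problem can be explored by brute force. The sketch below, with illustrative sizes (the paper's analysis concerns large networks far beyond enumeration), counts the ±1 weight vectors that classify P random patterns:

```python
# Hedged sketch: brute-force count of binary-perceptron solutions for a tiny
# single-layer network. Sizes are illustrative; N is odd so no ties occur.
import itertools
import numpy as np

rng = np.random.default_rng(5)
N, P = 15, 10                               # synapses, random patterns
xi = rng.choice([-1, 1], size=(P, N))       # inputs
sigma = rng.choice([-1, 1], size=P)         # desired outputs

n_solutions = 0
for w in itertools.product([-1, 1], repeat=N):
    w = np.array(w)
    if np.all(sigma * (xi @ w) > 0):        # all patterns classified correctly
        n_solutions += 1
print(f"{n_solutions} solutions out of {2**N} weight vectors")
```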
Article
This PhD thesis has the following structure: Chapter 1 - General introduction; Chapter 2 - Preliminaries; Chapter 3 - The Replicated Transfer Matrix; Chapter 4 - Finite Size Corrections On Random Graphs; Chapter 5 - The Random Field Ising Model; Chapter 6 - The Euclidean Assignment Problem; Chapter 7 - The Euclidean Matching Problem; Chapter 8 - Th...
Article
Full-text available
We propose a simple yet very predictive form, based on Poisson's equation, for the functional dependence of the cost on the density of points in the Euclidean bipartite matching problem. This leads, for quadratic costs, to the analytic prediction of the large $N$ limit of the average cost in dimension $d=1,2$ and of the subleading correction in...
Article
Full-text available
The presence of a random magnetic field in ferromagnetic systems leads, in the broken phase, to an anomalous $O(\sqrt{1/N})$ convergence of some thermodynamic quantities to their asymptotic limits. Here we show a general method, based on the replica trick, to compute analytically the $O(\sqrt{1/N})$ finite size correction to the average free energy...
Article
Full-text available
We derive the analytical expression for the first finite size correction to the average free energy of disordered Ising models on random regular graphs. The formula can be physically interpreted as a weighted sum over all non self-intersecting loops in the graph, the weight being the free-energy shift due to the addition of the loop to an infinite...
Article
Full-text available
We analyse the asymptotic behaviour of random instances of the Maximum Set Packing (MSP) optimization problem, also known as Maximum Matching or Maximum Strong Independent Set on Hypergraphs. We give an analytical prediction of the MSP size using the 1RSB cavity method from statistical mechanics of disordered systems. We also propose a heuristic a...
Article
Full-text available
Using a formalism based on the spectral decomposition of the replicated transfer matrix for disordered Ising models, we obtain several results that apply both to isolated one-dimensional systems and to locally tree-like graph and factor graph (p-spin) ensembles. We present exact analytical expressions, which can be efficiently approximated numerica...
Article
Full-text available
We study the finite-size corrections to the free-energy density in disordered spin systems on sparse random graphs, using both replica theory and the cavity method. We derive analytical expressions for the O(1/N) corrections in the replica symmetric phase as a linear combination of the free energies of open and closed chains. We perform a numerical...
