Publications (78) · 100.75 Total impact
ABSTRACT: We study the connection between the highly nonconvex loss function of a simple model of the fully connected feedforward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function are located in a well-defined narrow band lower-bounded by the global minimum. Furthermore, they form a layered structure. We show that the number of local minima outside the narrow band diminishes exponentially with the size of the network. We empirically demonstrate that the mathematical model exhibits behavior similar to that of the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band containing the largest number of critical points, and that all critical points found there are local minima and correspond to the same high learning quality measured by the test error. This emphasizes a major difference between large- and small-size networks, where for the latter poor-quality local minima have a nonzero probability of being recovered. Simultaneously we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant, as the global minimum often leads to overfitting.
Article: Biased random walks on random graphs
ABSTRACT: These notes cover one of the topics programmed for the St Petersburg School in Probability and Statistical Physics of June 2012. The aim is to review recent mathematical developments in the field of random walks in random environment. Our main focus will be on directionally transient and reversible random walks on different types of underlying graph structures, such as $\mathbb{Z}$, trees and $\mathbb{Z}^d$ for $d\geq 2$.
ABSTRACT: We take a first small step to extend the validity of Rudelson-Vershynin type estimates to some sparse random matrices, here random permutation matrices. We give lower (and upper) bounds on the smallest singular value of a large random matrix D+M where M is a random permutation matrix, sampled uniformly, and D is diagonal. When D is itself random with i.i.d. terms on the diagonal, we obtain a Rudelson-Vershynin type estimate, using the classical theory of random walks with negative drift.
ABSTRACT: The speed v(β) of a β-biased random walk on a Galton-Watson tree without leaves is increasing for β ≥ 1160. © 2013 Wiley Periodicals, Inc.
Communications on Pure and Applied Mathematics 04/2014; 67(4). DOI:10.1002/cpa.21505 · 3.13 Impact Factor
ABSTRACT: We analyze the landscape of general smooth Gaussian functions on the sphere in dimension N, when N is large. We give an explicit formula for the asymptotic complexity of the mean number of critical points of finite and diverging index at any level of energy and for the mean Euler characteristic of level sets. We then find two possible scenarios for the bottom landscape, one that has a layered structure of critical values and a strong correlation between indexes and critical values and another where even at levels below the limiting ground state energy the mean number of local minima is exponentially large. We end the paper by discussing how these results can be interpreted in the language of spin glass models.
The Annals of Probability 11/2013; 41(6). DOI:10.1214/13-AOP862 · 1.42 Impact Factor
Article: Randomly Trapped Random Walks
ABSTRACT: We introduce a general model of trapping for random walks on graphs. We give the possible scaling limits of these "Randomly Trapped Random Walks" on Z. These scaling limits include the well-known Fractional Kinetics process, the Fontes-Isopi-Newman singular diffusion as well as a new broad class we call Spatially Subordinated Brownian Motions. We give sufficient conditions for convergence and illustrate these on two important examples.
The Annals of Probability 02/2013; 43(5). DOI:10.1214/14-AOP939 · 1.42 Impact Factor
ABSTRACT: We give an asymptotic evaluation of the complexity of spherical p-spin spin-glass models via random matrix theory. This study enables us to obtain detailed information about the bottom of the energy landscape, including the absolute minimum (the ground state) and the other local minima, and to describe an interesting layered structure of the low critical values for the Hamiltonians of these models. We also show that our approach allows us to compute the related TAP-complexity and extend the results known in the physics literature. As an independent tool, we prove an LDP for the kth largest eigenvalue of the GOE, extending the results of Ben Arous, Dembo and Guionnet (2001).
Communications on Pure and Applied Mathematics 02/2013; 66(2). DOI:10.1002/cpa.21422 · 3.13 Impact Factor
ABSTRACT: As a model of trapping by biased motion in random structure, we study the time taken for a biased random walk to return to the root of a subcritical Galton-Watson tree. We do so for trees in which these biases are randomly chosen, independently for distinct edges, according to a law that satisfies a logarithmic non-lattice condition. The mean return time of the walk is in essence given by the total conductance of the tree. We determine the asymptotic decay of this total conductance, finding it to have a pure power-law decay. In the case of the conductance associated to a single vertex at maximal depth in the tree, this asymptotic decay may be analysed by the classical defective renewal theorem, due to the non-lattice edge-bias assumption. However, the derivation of the decay for total conductance requires computing an additional constant multiple outside the power-law that allows for the contribution of all vertices close to the base of the tree. This computation entails a detailed study of a convenient decomposition of the tree, under conditioning on the tree having high total conductance. As such, our principal conclusion may be viewed as a development of renewal theory in the context of random environments. For randomly biased random walk on a supercritical Galton-Watson tree with positive extinction probability, our main results may be regarded as a description of the slowdown mechanism caused by the presence of subcritical trees adjacent to the backbone that may act as traps that detain the walker. Indeed, this conclusion is exploited in \cite{GerardAlan} to obtain a stable limiting law for walker displacement in such a tree.
Communications on Pure and Applied Mathematics 11/2012; 65(11). DOI:10.1002/cpa.21416 · 3.13 Impact Factor
Article: Preface: From the Director
ABSTRACT: No abstract is available for this article.
Communications on Pure and Applied Mathematics 07/2012; 65(7). DOI:10.1002/cpa.21400 · 3.13 Impact Factor
ABSTRACT: We consider random hopping time (RHT) dynamics of the Sherrington-Kirkpatrick (SK) model and p-spin models of spin glasses. For any of these models and for any inverse temperature β > 0 we prove that, on time scales that are subexponential in the dimension, the properly scaled clock process (time-change process) of the dynamics converges to an extremal process. Moreover, on these time scales, the system exhibits aging-like behavior, which we call extremal aging. In other words, the dynamics of these models ages as the random energy model (REM) does. Hence, by extension, this confirms Bouchaud's REM-like trap model as a universal aging mechanism for a wide range of systems that, for the first time, includes the SK model. © 2011 Wiley Periodicals, Inc.
Communications on Pure and Applied Mathematics 01/2012; 65(1):77-127. DOI:10.1002/cpa.20372 · 3.13 Impact Factor
ABSTRACT: We consider a biased random walk $X_n$ on a Galton–Watson tree with leaves in the sub-ballistic regime. We prove that there exists an explicit constant $\gamma = \gamma(\beta) \in (0, 1)$, depending on the bias $\beta$, such that $X_n$ is of order $n^{\gamma}$. Denoting by $\Delta_n$ the hitting time of level $n$, we prove that $\Delta_n/n^{1/\gamma}$ is tight. Moreover, we show that $\Delta_n/n^{1/\gamma}$ does not converge in law (at least for large values of $\beta$). We prove that along the sequences $n_{\lambda}(k) = \lfloor \lambda \beta^{\gamma k} \rfloor$, $\Delta_n/n^{1/\gamma}$ converges to certain infinitely divisible laws. Key tools for the proof are the classical Harris decomposition for Galton–Watson trees, a new variant of regeneration times and the careful analysis of triangular arrays of i.i.d. heavy-tailed random variables.
The Annals of Probability 01/2012; 40(1). DOI:10.1214/10-AOP620 · 1.42 Impact Factor
ABSTRACT: We study the many-body quantum evolution of bosonic systems in the mean field limit. The dynamics is known to be well approximated by the Hartree equation. So far, the available results have the form of a law of large numbers. In this paper we go one step further and we show that the fluctuations around the Hartree evolution satisfy a central limit theorem. Interestingly, the variance of the limiting Gaussian distribution is determined by a time-dependent Bogoliubov transformation describing the dynamics of initial coherent states in a Fock space representation of the system.
Communications in Mathematical Physics 11/2011; 321(2). DOI:10.1007/s00220-013-1722-1 · 2.09 Impact Factor
ABSTRACT: The speed $v(\beta)$ of a $\beta$-biased random walk on a Galton-Watson tree without leaves is increasing for $\beta \geq 717$.
ABSTRACT: We analyze the landscape of general smooth Gaussian functions on the sphere in dimension N, when N is large. We give an explicit formula for the asymptotic complexity of the mean number of critical points of finite and diverging index at any level of energy and for the mean Euler characteristic of level sets. We then find two possible scenarios for the bottom energy landscape, one that has a layered structure of critical values and a strong correlation between indexes and critical values and another where even at energy levels below the limiting ground state energy the mean number of local minima is exponentially large. These two scenarios should correspond to the distinction between one-step replica symmetry breaking and full replica symmetry breaking of the physics literature on spin glasses. In the former, we find a new way to derive the asymptotic complexity function as a function of the 1RSB Parisi functional.
ABSTRACT: We prove the Einstein relation, relating the velocity under a small perturbation to the diffusivity in equilibrium, for certain biased random walks on Galton-Watson trees. This provides the first example where the Einstein relation is proved for motion in random media with arbitrarily deep traps.
Annales de l'Institut Henri Poincaré Probabilités et Statistiques 06/2011; 49(3). DOI:10.1214/12-AIHP486 · 1.06 Impact Factor
ABSTRACT: Smooth linear statistics of random permutation matrices, sampled under a general Ewens distribution, exhibit an interesting non-universality phenomenon. Though they have bounded variance, their fluctuations are asymptotically non-Gaussian but infinitely divisible. The fluctuations are asymptotically Gaussian for less smooth linear statistics for which the variance diverges. The degree of smoothness is measured in terms of the quality of the trapezoidal approximations of the integral of the observable.
Annales de l'Institut Henri Poincaré Probabilités et Statistiques 06/2011; 51(2). DOI:10.1214/13-AIHP569 · 1.06 Impact Factor
Article: Wigner matrices
ABSTRACT: This is a brief survey of some of the important results in the study of the eigenvalues and the eigenvectors of Wigner random matrices, i.e. random Hermitian (or real symmetric) matrices with i.i.d. entries. We review briefly the known universality results, which show how insensitive the behavior of the spectrum is to the distribution of the entries.
Article: Universality and extremal aging for dynamics of spin glasses on subexponential time scales
ABSTRACT: We consider Random Hopping Time (RHT) dynamics of the Sherrington-Kirkpatrick (SK) model and p-spin models of spin glasses. For any of these models and for any inverse temperature β > 0 we prove that, on time scales that are subexponential in the dimension, the properly scaled clock process (time-change process) of the dynamics converges to an extremal process. Moreover, on these time scales, the system exhibits aging-like behavior, which we call extremal aging. In other words, the dynamics of these models ages as the random energy model (REM) does. Hence, by extension, this confirms Bouchaud's REM-like trap model as a universal aging mechanism for a wide range of systems which, for the first time, includes the SK model.
ABSTRACT: This paper studies the extreme gaps between eigenvalues of random matrices. We give the joint limiting law of the smallest gaps for Haar-distributed unitary matrices and matrices from the Gaussian unitary ensemble. In particular, the kth smallest gap, normalized by a factor $n^{4/3}$, has a limiting density proportional to $x^{3k-1}e^{-x^3}$. Concerning the largest gaps, normalized by $n/\sqrt{\log n}$, they converge in ${\mathrm{L}}^p$ to a constant for all $p>0$. These results are compared with the extreme gaps between zeros of the Riemann zeta function.
The Annals of Probability 10/2010; 41(4). DOI:10.1214/11-AOP710 · 1.42 Impact Factor
ABSTRACT: We report here on the recent works [3] and [4]. There we consider a model of diffusion in random media with a two-way coupling (i.e. a model in which the randomness of the medium influences the diffusing particles, and where the diffusing particles change the medium). In this particular model, particles are injected at the origin with a time-dependent rate, and diffuse among random traps. Each trap has a finite (random) depth, so that when it has absorbed a finite (random) number of particles it is "saturated", and it no longer acts as a trap. Related models have been studied recently by Gravner and Quastel [10] and by Funaki [9] using hydrodynamic limit tools. We compute the asymptotic behaviour of the probability of survival of a particle born at some given time, both in the annealed and quenched cases, and show that three different situations occur depending on the injection rate. For weak injection, the typical survival strategy of the particle is as in Sznitman [16] and the survival probability behaves asymptotically as if there were no saturation effect. For medium injection rate, the picture is closer to that of Internal DLA, as given by Lawler, Bramson and Griffeath [13]. For large injection rates, the picture is less understood except in dimension one.
Publication Stats
2k  Citations  
100.75  Total Impact Points  
Top Journals
Institutions

2008–2014

NYU Langone Medical Center
New York, New York, United States


2007–2011

CUNY Graduate Center
New York City, New York, United States


2003–2009

Mathematical Sciences Research Institute
Berkeley, California, United States


2005

University of North Carolina at Charlotte
 Department of Mathematics & Statistics
Charlotte, North Carolina, United States


1998–2005

École Polytechnique Fédérale de Lausanne
Lausanne, Vaud, Switzerland


2000

Ecole polytechnique fédérale de Lausanne
Lausanne, Vaud, Switzerland


1997

Ecole Normale Supérieure de Paris
Lutetia Parisorum, Île-de-France, France


1993–1994

Université Paris-Sud 11
Orsay, Île-de-France, France
