Publications (72) · 93.33 Total impact
ABSTRACT: We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function are located in a well-defined narrow band lower-bounded by the global minimum. Furthermore, they form a layered structure. We show that the number of local minima outside the narrow band diminishes exponentially with the size of the network. We empirically demonstrate that the mathematical model exhibits behavior similar to that of the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band containing the largest number of critical points, and that all critical points found there are local minima and correspond to the same high learning quality measured by the test error. This emphasizes a major difference between large- and small-size networks, where for the latter poor-quality local minima have a non-zero probability of being recovered. Simultaneously, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant, as the global minimum often leads to overfitting.
11/2014
Article: Biased random walks on random graphs
ABSTRACT: These notes cover one of the topics programmed for the St Petersburg School in Probability and Statistical Physics of June 2012. The aim is to review recent mathematical developments in the field of random walks in random environment. Our main focus will be on directionally transient and reversible random walks on different types of underlying graph structures, such as $\mathbb{Z}$, trees and $\mathbb{Z}^d$ for $d\geq 2$.
06/2014
ABSTRACT: We take a first small step to extend the validity of Rudelson-Vershynin type estimates to some sparse random matrices, here random permutation matrices. We give lower (and upper) bounds on the smallest singular value of a large random matrix D+M where M is a random permutation matrix, sampled uniformly, and D is diagonal. When D is itself random with i.i.d. terms on the diagonal, we obtain a Rudelson-Vershynin type estimate, using the classical theory of random walks with negative drift.
04/2014
ABSTRACT: The speed v(β) of a β-biased random walk on a Galton-Watson tree without leaves is increasing for β ≥ 1160. © 2013 Wiley Periodicals, Inc.
Communications on Pure and Applied Mathematics 04/2014; 67(4). · 3.34 Impact Factor
ABSTRACT: The paper deals with the number of critical points of Gaussian smooth functions on the N-dimensional sphere, and more especially, it tries to characterize a Morse function on a high-dimensional sphere, and to determine the number of critical values of a given index, or below a given index. The main result is based on an identity which relates the mean number of critical points of index k with the k-th smallest eigenvalue of the Gaussian orthogonal ensemble, and shows that there is an exponentially large number of critical points of given index. The asymptotic complexity of the mean number of critical points is carefully investigated and an explicit formula is derived.
The Annals of Probability 11/2013; 41(6). · 1.43 Impact Factor
Article: Randomly Trapped Random Walks
ABSTRACT: We introduce a general model of trapping for random walks on graphs. We give the possible scaling limits of these "Randomly Trapped Random Walks" on Z. These scaling limits include the well-known Fractional Kinetics process, the Fontes-Isopi-Newman singular diffusion as well as a new broad class we call Spatially Subordinated Brownian Motions. We give sufficient conditions for convergence and illustrate these on two important examples.
02/2013
Article: Preface: From the Director
Communications on Pure and Applied Mathematics 07/2012; 65(7). · 3.34 Impact Factor
ABSTRACT: We consider a biased random walk $X_n$ on a Galton–Watson tree with leaves in the sub-ballistic regime. We prove that there exists an explicit constant $\gamma = \gamma(\beta) \in (0, 1)$, depending on the bias $\beta$, such that $X_n$ is of order $n^{\gamma}$. Denoting by $\Delta_n$ the hitting time of level $n$, we prove that $\Delta_n / n^{1/\gamma}$ is tight. Moreover, we show that $\Delta_n / n^{1/\gamma}$ does not converge in law (at least for large values of $\beta$). We prove that along the sequences $n_\lambda(k) = \lfloor \lambda \beta^{\gamma k} \rfloor$, $\Delta_n / n^{1/\gamma}$ converges to certain infinitely divisible laws. Key tools for the proof are the classical Harris decomposition for Galton–Watson trees, a new variant of regeneration times and the careful analysis of triangular arrays of i.i.d. heavy-tailed random variables.
The Annals of Probability 01/2012; 40(1). · 1.43 Impact Factor
ABSTRACT: We consider random hopping time (RHT) dynamics of the Sherrington-Kirkpatrick (SK) model and p-spin models of spin glasses. For any of these models and for any inverse temperature β > 0 we prove that, on time scales that are subexponential in the dimension, the properly scaled clock process (time-change process) of the dynamics converges to an extremal process. Moreover, on these time scales, the system exhibits aging-like behavior, which we call extremal aging. In other words, the dynamics of these models ages as the random energy model (REM) does. Hence, by extension, this confirms Bouchaud's REM-like trap model as a universal aging mechanism for a wide range of systems that, for the first time, includes the SK model. © 2011 Wiley Periodicals, Inc.
Communications on Pure and Applied Mathematics 12/2011; 65(1):77-127. · 3.34 Impact Factor
ABSTRACT: The speed $v(\beta)$ of a $\beta$-biased random walk on a Galton-Watson tree without leaves is increasing for $\beta \geq 717$.
11/2011
ABSTRACT: We analyze the landscape of general smooth Gaussian functions on the sphere in dimension N, when N is large. We give an explicit formula for the asymptotic complexity of the mean number of critical points of finite and diverging index at any level of energy and for the mean Euler characteristic of level sets. We then find two possible scenarios for the bottom energy landscape, one that has a layered structure of critical values and a strong correlation between indexes and critical values and another where even at energy levels below the limiting ground state energy the mean number of local minima is exponentially large. These two scenarios should correspond to the distinction between one-step replica symmetry breaking and full replica-symmetry breaking of the physics literature on spin glasses. In the former, we find a new way to derive the asymptotic complexity function as a function of the 1RSB Parisi functional.
10/2011
ABSTRACT: We prove the Einstein relation, relating the velocity under a small perturbation to the diffusivity in equilibrium, for certain biased random walks on Galton-Watson trees. This provides the first example where the Einstein relation is proved for motion in random media with arbitrarily deep traps.
Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 06/2011; · 0.97 Impact Factor
ABSTRACT: Smooth linear statistics of random permutation matrices, sampled under a general Ewens distribution, exhibit an interesting non-universality phenomenon. Though they have bounded variance, their fluctuations are asymptotically non-Gaussian but infinitely divisible. The fluctuations are asymptotically Gaussian for less smooth linear statistics for which the variance diverges. The degree of smoothness is measured in terms of the quality of the trapezoidal approximations of the integral of the observable.
06/2011
ABSTRACT: As a model of trapping by biased motion in random structure, we study the time taken for a biased random walk to return to the root of a subcritical Galton-Watson tree. We do so for trees in which these biases are randomly chosen, independently for distinct edges, according to a law that satisfies a logarithmic non-lattice condition. The mean return time of the walk is in essence given by the total conductance of the tree. We determine the asymptotic decay of this total conductance, finding it to have a pure power-law decay. In the case of the conductance associated to a single vertex at maximal depth in the tree, this asymptotic decay may be analysed by the classical defective renewal theorem, due to the non-lattice edge-bias assumption. However, the derivation of the decay for total conductance requires computing an additional constant multiple outside the power-law that allows for the contribution of all vertices close to the base of the tree. This computation entails a detailed study of a convenient decomposition of the tree, under conditioning on the tree having high total conductance. As such, our principal conclusion may be viewed as a development of renewal theory in the context of random environments. For randomly biased random walk on a supercritical Galton-Watson tree with positive extinction probability, our main results may be regarded as a description of the slowdown mechanism caused by the presence of subcritical trees adjacent to the backbone that may act as traps that detain the walker. Indeed, this conclusion is exploited in \cite{GerardAlan} to obtain a stable limiting law for walker displacement in such a tree.
Communications on Pure and Applied Mathematics 01/2011; · 3.34 Impact Factor
Article: Universality and extremal aging for dynamics of spin glasses on subexponential time scales
ABSTRACT: We consider Random Hopping Time (RHT) dynamics of the Sherrington-Kirkpatrick (SK) model and p-spin models of spin glasses. For any of these models and for any inverse temperature we prove that, on time scales that are subexponential in the dimension, the properly scaled clock process (time-change process) of the dynamics converges to an extremal process. Moreover, on these time scales, the system exhibits aging-like behavior, which we call extremal aging. In other words, the dynamics of these models ages as the random energy model (REM) does. Hence, by extension, this confirms Bouchaud's REM-like trap model as a universal aging mechanism for a wide range of systems which, for the first time, includes the SK model.
10/2010
ABSTRACT: This paper studies the extreme gaps between eigenvalues of random matrices. We give the joint limiting law of the smallest gaps for Haar-distributed unitary matrices and matrices from the Gaussian unitary ensemble. In particular, the k-th smallest gap, normalized by a factor $n^{-4/3}$, has a limiting density proportional to $x^{3k-1}e^{-x^3}$. Concerning the largest gaps, normalized by $n/\sqrt{\log n}$, they converge in ${\mathrm{L}}^p$ to a constant for all $p>0$. These results are compared with the extreme gaps between zeros of the Riemann zeta function.
The Annals of Probability 10/2010; · 1.43 Impact Factor
ABSTRACT: We give an asymptotic evaluation of the complexity of spherical p-spin spin-glass models via random matrix theory. This study enables us to obtain detailed information about the bottom of the energy landscape, including the absolute minimum (the ground state) and the other local minima, and to describe an interesting layered structure of the low critical values for the Hamiltonians of these models. We also show that our approach allows us to compute the related TAP-complexity and extend the results known in the physics literature. As an independent tool, we prove an LDP for the k-th largest eigenvalue of the GOE, extending the results of Ben Arous, Dembo and Guionnet (2001).
Communications on Pure and Applied Mathematics 03/2010; · 3.34 Impact Factor
ABSTRACT: We consider the family of two-sided Bernoulli initial conditions for TASEP which, as the left and right densities ($\rho_-,\rho_+$) are varied, give rise to shock waves and rarefaction fans, the two phenomena which are typical of TASEP. We provide a proof of Conjecture 7.1 of [Progr. Probab. 51 (2002) 185-204] which characterizes the order of and scaling functions for the fluctuations of the height function of two-sided TASEP in terms of the two densities $\rho_-,\rho_+$ and the speed $y$ around which the height is observed. In proving this theorem for TASEP, we also prove a fluctuation theorem for a class of corner growth processes with external sources, or equivalently for the last passage time in a directed last passage percolation model with two-sided boundary conditions: $\rho_-$ and $1-\rho_+$. We provide a complete characterization of the order of and the scaling functions for the fluctuations of this model's last passage time $L(N,M)$ as a function of three parameters: the two boundary/source rates $\rho_-$ and $1-\rho_+$, and the scaling ratio $\gamma^2=M/N$. The proof of this theorem draws on the results of [Comm. Math. Phys. 265 (2006) 1-44] and extensively on the work of [Ann. Probab. 33 (2005) 1643-1697] on finite rank perturbations of Wishart ensembles in random matrix theory.
The Annals of Probability 05/2009; 39(2011). · 1.43 Impact Factor
ABSTRACT: We continue here the study of free extreme values begun in Ben Arous and Voiculescu (2006). We study the convergence of the free point processes associated with free extreme values to a free Poisson random measure (Voiculescu (1998), Barndorff-Nielsen and Thorbjørnsen (2005)). We relate this convergence to the free extremal laws introduced in Ben Arous and Voiculescu (2006) and give the limit laws for free order statistics.
Probability Theory and Related Fields 04/2009; · 1.46 Impact Factor
ABSTRACT: We survey in this paper a universality phenomenon which shows that some characteristics of complex random energy landscapes are model-independent, or universal. This universality, called REM-universality, was discovered by S. Mertens and H. Bauke in the context of combinatorial optimization. We survey recent advances on the extent of this REM-universality for equilibrium as well as dynamical properties of spin glasses. We also focus on the limits of REM-universality, i.e., when it ceases to be valid.
Mathematics Subject Classification (2000): 82B44, 82D30, 82C44, 60G15, 60G55
Keywords: spin glasses, random energy model, extreme values, Gaussian processes, statistical mechanics, disordered media
12/2008: pages 45-84
Publication Stats
1k  Citations  
93.33  Total Impact Points  
Institutions

2008

CUNY Graduate Center
New York City, New York, United States


2006

Université Paris-Sud 11
Orsay, Île-de-France, France


1998–2005

École Polytechnique Fédérale de Lausanne
Lausanne, Vaud, Switzerland


2000

École Polytechnique Fédérale de Lausanne
Lausanne, Vaud, Switzerland


1999

Technion - Israel Institute of Technology
Electrical Engineering Group
Haifa, Haifa District, Israel


1995–1997

Ecole Normale Supérieure de Paris
Lutetia Parisorum, Île-de-France, France
