Weinan E
Princeton University · Department of Mathematics

About

335 Publications
73,387 Reads
20,884 Citations

Publications (335)
Preprint
One of the oldest and most studied subjects in scientific computing is algorithms for solving partial differential equations (PDEs). A long list of numerical methods has been proposed and successfully used for various applications. In recent years, deep learning methods have shown their superiority for high-dimensional PDEs, where traditional method...
Preprint
Full-text available
We report an ab initio multi-scale study of lead titanate using the Deep Potential (DP) models, a family of machine learning-based atomistic models, trained on first-principles density functional theory data, to represent potential and polarization surfaces. Our approach includes anharmonic effects beyond the limitations of reduced models and of th...
Preprint
Collisions are common in many dynamical systems with real applications. They can be formulated as hybrid dynamical systems with discontinuities automatically triggered when states transverse certain manifolds. We present an algorithm for the optimal control problem of such hybrid dynamical systems, based on solving the equations derived from the hy...
Preprint
Solving complex optimal control problems has long posed computational challenges. Recent advances in machine learning have provided us with new opportunities to address these challenges. This paper takes model predictive control, a popular optimal control method, as the primary example to survey recent progress that leverages mac...
Article
Full-text available
To fill the gap between accurate (and expensive) ab initio calculations and efficient atomistic simulations based on empirical interatomic potentials, a new class of descriptions of atomic interactions has emerged and been widely applied: machine learning potentials (MLPs). One recently developed type of MLP is the Deep Potential (DP) method....
Article
Full-text available
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae. In two cases, we describe the space explicitly up to isomorphism. Using a convenient representation, we study the pointwise properties of two-layer networks and show that functions wh...
Article
Machine learning models for the potential energy of multi-atomic systems, such as the deep potential (DP) model, make molecular simulations with the accuracy of quantum mechanical density functional theory possible at a cost only moderately higher than that of empirical force fields. However, the majority of these models lack explicit long-range in...
Preprint
We propose a machine learning enhanced algorithm for solving the optimal landing problem. Using Pontryagin's minimum principle, we derive a two-point boundary value problem for the landing problem. The proposed algorithm uses deep learning to predict the optimal landing time and a space-marching technique to provide good initial guesses for the bou...
Article
Full-text available
The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood. In particular, GAN is vulnerable to the memorization phenomenon, the eventual convergence to the empirical distribution. We consider a simplified GAN model with the generator...
Preprint
Full-text available
To fill the gap between accurate (and expensive) ab initio calculations and efficient atomistic simulations based on empirical interatomic potentials, a new class of descriptions of atomic interactions has emerged and been widely applied: machine learning potentials (MLPs). One recently developed type of MLP is the Deep Potential (DP) method....
Article
Full-text available
One of the key issues in the analysis of machine learning models is to identify the appropriate function space and norm for the model. This is the set of functions endowed with a quantity which can control the approximation and estimation errors by a particular machine learning model. In this paper, we address this issue for two representative neur...
Article
Full-text available
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in...
Article
Full-text available
Enhanced sampling methods such as metadynamics and umbrella sampling have become essential tools for exploring the configuration space of molecules and materials. At the same time, they have long faced a number of issues such as the inefficiency when dealing with a large number of collective variables (CVs) or systems with high free energy barriers...
Preprint
Full-text available
A long-standing problem in the modeling of non-Newtonian hydrodynamics is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics. The main complication arises from the long polymer relaxation time, the complex molecular structure, and heterogeneous interactions. DeePN$^2$...
Preprint
We propose an efficient, reliable, and interpretable global solution method, $\textit{Deep learning-based algorithm for Heterogeneous Agent Models, DeepHAM}$, for solving high dimensional heterogeneous agent models with aggregate shocks. The state distribution is approximately represented by a set of optimal generalized moments. Deep neural network...
Preprint
Full-text available
Machine learning models for the potential energy of multi-atomic systems, such as the deep potential (DP) model, make possible molecular simulations with the accuracy of quantum mechanical density functional theory, at a cost only moderately higher than that of empirical force fields. However, the majority of these models lack explicit long-range i...
Article
Full-text available
We introduce a new family of numerical algorithms for approximating solutions of general high-dimensional semilinear parabolic partial differential equations at single space-time points. The algorithm is obtained through a delicate combination of the Feynman–Kac and the Bismut–Elworthy–Li formulas, and an approximate decomposition of the Picard fix...
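The Feynman–Kac ingredient of such algorithms can be illustrated in isolation. Below is a minimal Monte Carlo sketch (our illustration, not the paper's Picard-decomposition scheme; all names are ours) for a d-dimensional terminal-value heat problem whose exact solution is known and can serve as a check:

```python
import numpy as np

def feynman_kac_heat(x, t, T, g, n_paths=100_000, rng=None):
    """Estimate u(t, x) for the terminal-value problem
    du/dt + 0.5 * Laplacian(u) = 0,  u(T, .) = g,
    via the Feynman-Kac identity u(t, x) = E[g(x + W_{T-t})]."""
    rng = np.random.default_rng(rng)
    d = len(x)
    # One Gaussian increment per path: W_{T-t} ~ N(0, (T - t) I_d).
    W = rng.standard_normal((n_paths, d)) * np.sqrt(T - t)
    return g(x + W).mean()

# Terminal condition g(x) = |x|^2 has the closed form
# u(t, x) = |x|^2 + d * (T - t), which lets us check the estimate.
d, T = 10, 1.0
x = np.zeros(d)
u_hat = feynman_kac_heat(x, 0.0, T, lambda y: (y**2).sum(axis=1), rng=0)
exact = d * T  # |x|^2 + d * (T - t) at x = 0, t = 0
```

The Monte Carlo error here is dimension-independent in the number of paths, which is the basic reason such representations avoid grid-based discretization in high dimension.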
Article
We propose a systematic method for learning stable and physically interpretable dynamical models using sampled trajectory data from physical processes based on a generalized Onsager principle. The learned dynamics are autonomous ordinary differential equations parametrized by neural networks that retain clear physical structure information, such as...
Preprint
Full-text available
The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood. In particular, GAN is vulnerable to the memorization phenomenon, the eventual convergence to the empirical distribution. We consider a simplified GAN model with the generator...
Article
Full-text available
Using the Deep Potential methodology, we construct a model that reproduces accurately the potential energy surface of the SCAN approximation of density functional theory for water, from low temperature and pressure to about 2400 K and 50 GPa, excluding the vapor stability region. The computational efficiency of the model makes it possible to predic...
Preprint
Full-text available
Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states. However, most analysis of such algorithms gives rise to error bounds that involve either the number of states or the number of features. This paper considers th...
Preprint
Full-text available
Enhanced sampling methods such as metadynamics and umbrella sampling have become essential tools for exploring the configuration space of molecules and materials. At the same time, they have long faced the following dilemma: Since they are only effective with a small number of collective variables (CVs), choosing a proper set of CVs becomes critica...
Preprint
We propose a unified framework that extends the inference methods for classical hidden Markov models to continuous settings, where both the hidden states and observations occur in continuous time. Two different settings are analyzed: (1) hidden jump process with a finite state space; (2) hidden diffusion process with a continuous state space. For e...
Article
Solid-state electrolyte materials with superior lithium ionic conductivities are vital to the next-generation Li-ion batteries. Molecular dynamics could provide atomic-scale information to understand the diffusion process of Li ions in these superionic conductor materials. Here, we implement the deep potential generator to set up an efficient protoc...
Article
Full-text available
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces. The general technique is then applied to show that reproducing kernel Hilbert spaces are poor $L^2$-approximators for the class of two-layer neural netwo...
Preprint
Full-text available
Using the Deep Potential methodology, we construct a model that reproduces accurately the potential energy surface of the SCAN approximation of density functional theory for water, from low temperature and pressure to about 2400 K and 50 GPa, excluding the vapor stability region. The computational efficiency of the model makes it possible to predic...
Article
We present the GPU version of DeePMD-kit, which, upon training a deep neural network model using ab initio data, can drive extremely large-scale molecular dynamics (MD) simulation with ab initio accuracy. Our tests show that for a water system of 12,582,912 atoms, the GPU version can be 7 times faster than the CPU version under the same power consu...
Article
We propose the coarse-grained spectral projection method (CGSP), a deep learning assisted approach for tackling quantum unitary dynamics problems with an emphasis on quench dynamics. We show that CGSP can extract spectral components of many-body quantum states systematically with a sophisticated neural network quantum ansatz. CGSP fully exploits the...
Preprint
Full-text available
We introduce DeePKS-kit, an open-source software package for developing machine learning based energy and density functional models. DeePKS-kit is interfaced with PyTorch, an open-source machine learning library, and PySCF, an ab initio computational chemistry program that provides simple and customized tools for developing quantum chemistry codes....
Preprint
A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a sing...
Article
We propose a general machine learning-based framework for building an accurate and widely applicable energy functional within the framework of generalized Kohn-Sham density functional theory. To this end, we develop a way of training self-consistent models that are capable of taking large datasets from different systems and different kinds of label...
Preprint
We use explicit representation formulas to show that solutions to certain partial differential equations can be represented efficiently using artificial neural networks, even in high dimension. Conversely, we present examples in which the solution fails to lie in the function space associated to a neural network under consideration.
Preprint
Full-text available
Models for learning probability distributions such as generative models and density estimators behave quite differently from models for learning functions. One example is found in the memorization phenomenon, namely the ultimate convergence to the empirical distribution, that occurs in generative adversarial networks (GANs). For this reason, the is...
Article
We introduce a machine-learning-based framework for constructing a continuum non-Newtonian fluid dynamics model directly from a microscale description. Dumbbell polymer solutions are used as examples to demonstrate the essential ideas. To faithfully retain molecular fidelity, we establish a micro-macro correspondence via a set of encoders for the m...
Preprint
It is not yet clear why Adam-like adaptive gradient algorithms suffer worse generalization performance than SGD despite their faster training speed. This work aims to provide an understanding of this generalization gap by analyzing their local convergence behaviors. Specifically, we observe the heavy tails of gradient noise in these algorithms....
Article
We prove that the gradient descent training of a two-layer neural network on empirical or population risk may not decrease population risk at an order faster than $t^{-4/(d-2)}$ under mean field scaling...
Preprint
We consider binary and multi-class classification problems using hypothesis classes of neural networks. For a given hypothesis class, we use Rademacher complexity estimates and direct approximation theorems to obtain a priori error estimates for regularized loss functionals.
Preprint
Neural network-based machine learning is capable of approximating functions in very high dimension with unprecedented efficiency and accuracy. This has opened up many exciting new possibilities, not just in traditional areas of artificial intelligence, but also in scientific computing and computational science. At the same time, machine learning ha...
Preprint
Full-text available
The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning. In the tradition of good old applied mathematics, we will give attention not only to rigorous mathematical results, but also to the insight we have gai...
Preprint
Full-text available
We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data. We consider the simple but representative setting of using continuous-time linear RNNs to learn from data generated by linear relationships. Mathematically, the latter can be understo...
Preprint
The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations. Three types of qualitative features are observed in the training loss curve: fast initial convergence, oscillations and large spikes. The sign gradient descent (signGD) algorithm, which is the limit of...
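The signGD limit mentioned here can be seen directly from the update rule: when the moment-averaging constants are sent to zero, the Adam/RMSprop direction collapses to the sign of the gradient. A small sketch (bias correction omitted; function and variable names are ours):

```python
import numpy as np

def adam_step(g, m, v, lr, b1, b2, eps=1e-12):
    """One Adam/RMSprop-style update direction (no bias correction)."""
    m = b1 * m + (1 - b1) * g        # first-moment estimate
    v = b2 * v + (1 - b2) * g**2     # second-moment estimate
    return lr * m / (np.sqrt(v) + eps), m, v

g = np.array([0.3, -2.0, 0.001])
# With b1 = b2 = 0 the moment estimates collapse to g and g**2,
# so the update direction reduces to lr * sign(g), i.e. signGD.
step, _, _ = adam_step(g, m=np.zeros(3), v=np.zeros(3), lr=0.1, b1=0.0, b2=0.0)
sign_step = 0.1 * np.sign(g)
```

Note that the step size is then independent of the gradient magnitude, which is one source of the oscillations and spikes observed in the training loss curve.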
Article
We characterize the meaning of words with language-independent numerical fingerprints, through a mathematical analysis of recurring patterns in texts. Approximating texts by Markov processes on a long-range time scale, we are able to extract topics, discover synonyms, and sketch semantic fields from a particular document of moderate length, without...
Preprint
Full-text available
We propose a systematic method for learning stable and interpretable dynamical models using sampled trajectory data from physical processes based on a generalized Onsager principle. The learned dynamics are autonomous ordinary differential equations parameterized by neural networks that retain clear physical structure information, such as free ener...
Article
Full-text available
We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, in the spirit of classical numerical analysis. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural...
Preprint
Full-text available
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in...
Preprint
The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size. This behavior is characterized by the appearance of a large generalization gap, and is due to the occurrence of very small eigenvalues of the associated Gram matrix. In this paper, we examine the dynamic behavior of the...
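The spectral mechanism described here is easy to reproduce numerically. The sketch below uses a plain Gaussian design matrix as a stand-in for the random-feature design (an assumption made for illustration, not the paper's exact setup): as the number of features approaches the sample size, the smallest singular value, and hence the smallest Gram eigenvalue, collapses toward zero.

```python
import numpy as np

def min_singular_value(n, m, rng=0):
    """Smallest singular value of an n x m design matrix with i.i.d.
    N(0, 1/n) entries -- a stand-in for a random-feature design."""
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((n, m)) / np.sqrt(n)
    return np.linalg.svd(A, compute_uv=False).min()

n = 400
under = min_singular_value(n, m=n // 4)  # m << n: well-conditioned
critical = min_singular_value(n, m=n)    # m ~ n: near-singular Gram matrix
```

Since the smallest Gram eigenvalue is the square of this singular value, the condition number blows up precisely in the "resonant" regime where parameters and samples are comparable in number.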
Article
We introduce the Deep Post-Hartree-Fock (DeePHF) method, a machine learning based scheme for constructing accurate and transferable models for the ground-state energy of electronic structure problems. DeePHF predicts the energy difference between results of highly accurate models such as the coupled cluster method and low accuracy models such as the...
Preprint
Full-text available
We propose a general machine learning-based framework for building an accurate and widely-applicable energy functional within the framework of generalized Kohn-Sham density functional theory. To this end, we develop a way of training self-consistent models that are capable of taking large datasets from different systems and different kinds of label...
Preprint
We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width. The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm. Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable general...
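For a fully connected ReLU network, the path norm referred to here is the sum over all input-to-output paths of the products of absolute weights; it can be computed by propagating a vector of ones through the absolute-value weight matrices. A minimal sketch (biases omitted; function name is ours):

```python
import numpy as np

def path_norm(weights):
    """Path norm of a fully connected ReLU net: the sum over all
    input-to-output paths of the product of absolute weights,
    computed as 1^T |W_L| ... |W_1| 1."""
    v = np.ones(weights[0].shape[1])  # one entry per input coordinate
    for W in weights:
        v = np.abs(W) @ v
    return v.sum()

# Two-layer example: each path is a pair (a_j, w_jk), so the path
# norm equals sum_j |a_j| * sum_k |w_jk|.
W1 = np.array([[1.0, -2.0], [0.5, 0.0]])  # 2 hidden units, 2 inputs
W2 = np.array([[3.0, -1.0]])              # 1 output unit
pn = path_norm([W1, W2])  # 3*(1+2) + 1*(0.5+0) = 9.5
```

The quantity is invariant under the ReLU rescaling symmetry (scaling a hidden unit's incoming weights up and outgoing weights down), which is one reason it is a natural norm for these spaces.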
Article
We introduce a deep neural network to model in a symmetry preserving way the environmental dependence of the centers of the electronic charge. The model learns from ab initio density functional theory, wherein the electronic centers are uniquely assigned by the maximally localized Wannier functions. When combined with the deep potential model of th...
Preprint
We propose the coarse-grained spectral projection method (CGSP), a deep learning approach for tackling quantum unitary dynamics problems with an emphasis on quench dynamics. We show CGSP can extract spectral components of many-body quantum states systematically with a highly entangled neural network quantum ansatz. CGSP fully exploits the linear unita...
Preprint
A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons. It is found that for Xavier-like initialization, there are two distinctive phases i...
Preprint
We study the natural function space for infinitely wide two-layer neural networks and establish different representation formulae. In two cases, we describe the space explicitly up to isomorphism. Using a convenient representation, we study the pointwise properties of two-layer networks and show that functions whose singular set is fractal or curve...
Preprint
Full-text available
It has been a challenge to accurately simulate Li-ion diffusion processes in battery materials at room temperature using ab initio molecular dynamics (AIMD) due to its high computational cost. This situation has changed drastically in recent years due to the advances in machine learning-based interatomic potentials. Here we implement the Deep...
Preprint
Full-text available
Machine learning is poised as a very powerful tool that can drastically improve our ability to carry out scientific research. However, many issues need to be addressed before this becomes a reality. This article focuses on one particular issue of broad interest: How can we integrate machine learning with physics-based modeling to develop new interp...
Preprint
We prove that the gradient descent training of a two-layer neural network on empirical or population risk may not decrease population risk at an order faster than $t^{-4/(d-2)}$ under mean field scaling. Thus gradient descent training for fitting reasonably smooth, but truly high-dimensional data may be subject to the curse of dimensionality. We pr...
Preprint
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces. The general technique is then applied to show that reproducing kernel Hilbert spaces are poor $L^2$-approximators for the class of two-layer neural netwo...
Article
Spatial artificial neural network (ANN) models are developed for subgrid-scale (SGS) forces in the large eddy simulation (LES) of turbulence. The input features are based on the first-order derivatives of the filtered velocity field at different spatial locations. The correlation coefficients of SGS forces predicted by the spatial artificial neural...
Preprint
Full-text available
For 35 years, ab initio molecular dynamics (AIMD) has been the method of choice for understanding complex materials and molecules at the atomic scale from first principles. However, most applications of AIMD are limited to systems with thousands of atoms due to the high computational complexity. We report that a machine learning-based molecul...
Preprint
Full-text available
We introduce the Deep Post-Hartree-Fock (DeePHF) method, a machine learning based scheme for constructing accurate and transferable models for the ground-state energy of electronic structure problems. DeePHF predicts the energy difference between results of highly accurate models such as the coupled cluster method and low accuracy models such as th...
Preprint
Full-text available
We present the GPU version of DeePMD-kit, which, upon training a deep neural network model using ab initio data, can drive extremely large-scale molecular dynamics (MD) simulation with ab initio accuracy. Our tests show that the GPU version is 7 times faster than the CPU version with the same power consumption. The code can scale up to the entire S...
Preprint
We introduce a machine-learning-based framework for constructing a continuum non-Newtonian fluid dynamics model directly from a micro-scale description. A polymer solution is used as an example to demonstrate the essential ideas. To faithfully retain molecular fidelity, we establish a micro-macro correspondence via a set of encoders for the micro-scale...
Article
In recent years, promising deep learning based interatomic potential energy surface (PES) models have been proposed that can potentially allow us to perform molecular dynamics simulations for large scale systems with quantum accuracy. However, making these models truly reliable and practically useful is still a very non-trivial task. A key componen...
Article
A fairly comprehensive analysis is presented for the gradient descent dynamics for training two-layer neural network models in the situation when the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the over-parametrized regime, it is sho...
Preprint
Full-text available
We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, very much in the spirit of classical numerical analysis and statistical physics. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the shallow neural networ...
Preprint
Full-text available
We study the generalization properties of minimum-norm solutions for three over-parametrized machine learning models including the random feature model, the two-layer neural network model and the residual network model. We prove that for all three models, the generalization error for the minimum-norm solution is comparable to the Monte Carlo rate,...
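The notion of a minimum-norm solution can be made concrete in the linear case: among all interpolants of an under-determined least-squares problem, the Moore–Penrose pseudo-inverse selects the one with the smallest Euclidean norm. A small numerical illustration (ours, not the paper's analysis):

```python
import numpy as np

# Under-determined least squares: many interpolating solutions exist;
# the pseudo-inverse picks the one with the smallest Euclidean norm.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))   # 5 samples, 20 parameters
y = rng.standard_normal(5)

theta_min = np.linalg.pinv(A) @ y  # minimum-norm interpolant
# Any other interpolant differs by a null-space component, which can
# only increase the norm (the two components are orthogonal).
theta_other = theta_min + (np.eye(20) - np.linalg.pinv(A) @ A) @ np.ones(20)
```

Both vectors fit the data exactly; `theta_min` is the shorter one, which is the solution implicitly selected by gradient descent started from zero in such linear models.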
Preprint
Full-text available
In recent years, promising deep learning based interatomic potential energy surface (PES) models have been proposed that can potentially allow us to perform molecular dynamics simulations for large scale systems with quantum accuracy. However, making these models truly reliable and practically useful is still a very non-trivial task. A key componen...
Article
Full-text available
A framework is introduced for constructing interpretable and truly reliable reduced models for multiscale problems in situations without scale separation. Hydrodynamic approximation to the kinetic equation is used as an example to illustrate the main steps and issues involved. To this end, a set of generalized moments are constructed first to optim...
Article
A comprehensive microscopic understanding of ambient liquid water is a major challenge for ab initio simulations as it simultaneously requires an accurate quantum mechanical description of the underlying potential energy surface (PES) as well as extensive sampling of configuration space. Due to the presence of light atoms (e.g. H or D), nuclear qua...
Article
We introduce a new family of trial wave-functions based on deep neural networks to solve the many-electron Schrödinger equation. The Pauli exclusion principle is dealt with explicitly to ensure that the trial wave-functions are physical. The optimal trial wave-function is obtained through variational Monte Carlo and the computational cost scales qu...
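The explicit treatment of the Pauli exclusion principle is typically enforced through a determinantal (Slater-type) structure: a determinant changes sign when two rows are exchanged, so the wavefunction is antisymmetric under particle exchange. A toy sketch with polynomial orbitals (purely illustrative; the paper's ansatz uses deep neural networks):

```python
import numpy as np

def slater(orbitals, X):
    """Slater-determinant wavefunction: det of the matrix
    M[i, j] = phi_j(x_i). Antisymmetric under particle exchange."""
    M = np.array([[phi(x) for phi in orbitals] for x in X])
    return np.linalg.det(M)

# Three simple single-particle "orbitals" (purely illustrative).
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x**2]
X = np.array([0.3, -1.2, 2.0])
X_swapped = X[[1, 0, 2]]  # exchange particles 1 and 2
```

Exchanging two particles swaps two rows of the matrix, so `slater(orbitals, X_swapped)` equals `-slater(orbitals, X)`; any symmetric function multiplying such a determinant preserves this antisymmetry.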
Article
Full-text available
High-dimensional partial differential equations (PDEs) appear in a number of models from the financial industry, such as in derivative pricing models, credit valuation adjustment (CVA) models, or portfolio optimization models. The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portf...
Preprint
Full-text available
Inspired by chemical kinetics and neurobiology, we propose a mathematical theory for pattern recurrence in text documents, applicable to a wide variety of languages. We present a Markov model at the discourse level for Steven Pinker's "mentalese", or chains of mental states that transcend the spoken/written forms. Such (potentially) universal tem...