Weinan E
Princeton University · Department of Mathematics

About

374 Publications · 110,603 Reads
32,847 Citations

Publications (374)
Preprint
Automated drug discovery offers significant potential for accelerating the development of novel therapeutics by substituting labor-intensive human workflows with machine-driven processes. However, a critical bottleneck persists in the inability of current automated frameworks to assess whether newly designed molecules infringe upon existing patents...
Article
Full-text available
Molecular dynamics (MD) is an indispensable atomistic-scale computational tool widely used in various disciplines. In the past decades, nearly all ab initio MD and machine-learning MD have been based on general-purpose central/graphics processing units (CPU/GPU), which are well known to suffer from their intrinsic “memory wall” and “power wall”...
Preprint
Full-text available
We report an extensive molecular dynamics study, of ab initio quality, of the ferroelectric phase transition in crystalline PbTiO3. We model anharmonicity accurately in terms of potential energy and polarization surfaces trained on density functional theory data with modern machine learning techniques. Our simulations demonstrate that the transition...
Preprint
Full-text available
The study of structure-spectrum relationships is essential for spectral interpretation, impacting structural elucidation and material design. Predicting spectra from molecular structures is challenging due to their complex relationships. Herein, we introduce NMRNet, a deep learning framework using the SE(3) Transformer for atomic environment modeli...
Preprint
Full-text available
Advancements in lithium battery technology heavily rely on the design and engineering of electrolytes. However, current schemes for molecular design and recipe optimization of electrolytes lack an effective computational-experimental closed loop and often fall short in accurately predicting diverse electrolyte formulation properties. In this work,...
Preprint
Full-text available
In recent years, pretraining models have made significant advancements in the fields of natural language processing (NLP), computer vision (CV), and life sciences. The significant advancements in NLP and CV are predominantly driven by the expansion of model parameters and data size, a phenomenon now recognized as the scaling laws. However, research...
Preprint
Full-text available
Accurate sampling of protein conformations is pivotal for advances in biology and medicine. Although there has been tremendous progress in protein structure prediction in recent years due to deep learning, models that can predict the different stable conformations of proteins with high accuracy and structural validity are still lacking. Here, we...
Preprint
Full-text available
A data-driven ab initio generalized Langevin equation (AIGLE) approach is developed to learn and simulate high-dimensional, heterogeneous, coarse-grained conformational dynamics. Constrained by the fluctuation-dissipation theorem, the approach can build coarse-grained models in dynamical consistency with all-atom molecular dynamics. We also propose...
Preprint
Full-text available
The rapid development of artificial intelligence (AI) is driving significant changes in the field of atomic modeling, simulation, and design. AI-based potential energy models have been successfully used to perform large-scale and long-time simulations with the accuracy of ab initio electronic structure methods. However, the model generation process...
Article
Full-text available
Machine learning has been widely used for solving partial differential equations (PDEs) in recent years, among which the random feature method (RFM) exhibits spectral accuracy and can compete with traditional solvers in terms of both accuracy and efficiency. Potentially, the optimization problem in the RFM is more difficult to solve than those that...
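The core idea above can be illustrated with a minimal sketch (not the paper's implementation): fix random inner weights and biases of a small feature basis, impose the PDE at collocation points, and solve a linear least-squares problem for the outer coefficients. The manufactured 1D Poisson problem and all parameter choices below are illustrative assumptions.

```python
# Minimal random-feature sketch for -u''(x) = f(x) on (0,1), u(0) = u(1) = 0.
import numpy as np

rng = np.random.default_rng(0)
M = 200                                  # number of random features
w = rng.uniform(-20, 20, size=M)         # random (fixed) inner weights
b = rng.uniform(-20, 20, size=M)         # random (fixed) biases

def phi(x):                              # feature matrix, phi_j(x) = tanh(w_j x + b_j)
    return np.tanh(np.outer(x, w) + b)

def phi_xx(x):                           # second derivative of each feature
    t = np.tanh(np.outer(x, w) + b)
    return (w**2) * (-2.0 * t * (1.0 - t**2))

# manufactured problem: u*(x) = sin(pi x), hence f(x) = pi^2 sin(pi x)
f = lambda x: np.pi**2 * np.sin(np.pi * x)

x_in = np.linspace(0.0, 1.0, 400)        # interior collocation points
x_bc = np.array([0.0, 1.0])              # boundary points

# linear least-squares system: -u'' = f in the interior, u = 0 on the boundary
A = np.vstack([-phi_xx(x_in), phi(x_bc)])
rhs = np.concatenate([f(x_in), np.zeros(2)])
c, *_ = np.linalg.lstsq(A, rhs, rcond=None)

u = phi(x_in) @ c
print("max error:", np.abs(u - np.sin(np.pi * x_in)).max())
```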
Article
We introduce a machine learning–based approach called ab initio generalized Langevin equation (AIGLE) to model the dynamics of slow collective variables (CVs) in materials and molecules. In this scheme, the parameters are learned from atomistic simulations based on ab initio quantum mechanical models. Force field, memory kernel, and noise generator...
Article
Full-text available
DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and material science for studying atomistic systems. The current ver...
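For context, a minimal usage sketch with the DeePMD-kit Python inference interface is given below. It assumes DeePMD-kit is installed and a trained model file is available; the file name "graph.pb", the toy water geometry, and the type indices are illustrative assumptions, and the exact API may vary between DeePMD-kit versions.

```python
# Evaluating a trained Deep Potential model with the DeePMD-kit Python interface.
import numpy as np
from deepmd.infer import DeepPot

dp = DeepPot("graph.pb")                       # load the trained DP model (file name assumed)

# a single water molecule in a 10 A cubic box (coordinates in Angstrom)
coords = np.array([[0.00, 0.00, 0.00,          # O
                    0.96, 0.00, 0.00,          # H
                   -0.24, 0.93, 0.00]])        # H
cells = 10.0 * np.eye(3).reshape(1, 9)
atom_types = [0, 1, 1]                         # type indices as defined at training time

# energy (eV), forces (eV/A) and virial for this configuration
energy, force, virial = dp.eval(coords, cells, atom_types)
print(energy, force.reshape(-1, 3))
```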
Preprint
Full-text available
Simulating electronic behavior in materials and devices with realistic large system sizes remains a formidable task within the ab initio framework. We propose DeePTB, an efficient deep learning-based tight-binding (TB) approach with ab initio accuracy to address this issue. By training with ab initio eigenvalues, our method can efficien...
Article
Full-text available
Unraveling the reaction paths and structural evolutions during charging/discharging processes is critical for the development and tailoring of silicon anodes for high‐capacity batteries. However, a mechanistic understanding is still lacking due to the complex phase transformations between crystalline (c‐) and amorphous (a‐) phases involved in elec...
Article
We propose a quantum Monte Carlo approach to solve the many-body Schrödinger equation for the electronic ground state. The method combines optimization from variational Monte Carlo and propagation from auxiliary field quantum Monte Carlo in a way that significantly alleviates the sign problem. In application to molecular systems, we obtain highly a...
Preprint
Full-text available
DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials (MLP) known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and material science for studying atomistic systems. The curre...
Preprint
Full-text available
We present a framework for solving time-dependent partial differential equations (PDEs) in the spirit of the random feature method. The numerical solution is constructed using a space-time partition of unity and random feature functions. Two different ways of constructing the random feature functions are investigated: feature functions that treat t...
Preprint
Full-text available
Silicon has been extensively studied as one of the most promising anode materials for next-generation lithium-ion batteries because of its high specific capacity. However, a direct understanding of the atomic-scale mechanism associated with the charging and discharging processes for the silicon anode is still lacking, partly due to the fact that th...
Article
Solving complex optimal control problems has confronted computational challenges for a long time. Recent advances in machine learning have provided us with new opportunities to address these challenges. This paper takes model predictive control, a popular optimal control method, as the primary example to survey recent progress that leverages machi...
Preprint
We propose a quantum Monte Carlo approach to solve the many-body Schrödinger equation for the electronic ground state. The method combines optimization from variational Monte Carlo and propagation from auxiliary field quantum Monte Carlo, in a way that significantly alleviates the sign problem. In application to molecular systems, we o...
Preprint
Full-text available
We propose an approach for learning accurately the dynamics of slow collective variables from atomistic data obtained from ab-initio quantum mechanical theory, using generalized Langevin equations (GLE). The force fields, memory kernel, and noise generator are constructed within the Mori-Zwanzig formalism under the constraint imposed by the fluctua...
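To make the GLE structure concrete, the following is a minimal sketch, not the learned model of the paper: a single coarse-grained variable evolving under one exponential memory kernel, integrated via the standard Markovian embedding, with the noise amplitude fixed by the fluctuation-dissipation theorem. The double-well potential and all parameters are illustrative assumptions.

```python
# Scalar GLE with memory kernel K(t) = (gamma/tau) exp(-t/tau), via one auxiliary variable.
import numpy as np

rng = np.random.default_rng(0)
kT, m, gamma, tau = 1.0, 1.0, 2.0, 0.5          # illustrative parameters
V_prime = lambda q: q**3 - q                    # V(q) = q^4/4 - q^2/2 (double well)

dt, nsteps = 1e-3, 200_000
q, v, z = 1.0, 0.0, 0.0                         # position, velocity, auxiliary memory force
sigma = np.sqrt(2.0 * kT * gamma) / tau         # noise amplitude fixed by the FDT

v2 = np.empty(nsteps)
for i in range(nsteps):
    # Euler-Maruyama step for the extended (Markovian) system equivalent to the GLE
    #   m dv/dt = -V'(q) - int_0^t K(t-s) v(s) ds + R(t)
    q += v * dt
    v += (-V_prime(q) + z) / m * dt
    z += (-z / tau - (gamma / tau) * v) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    v2[i] = v * v

# consistency check: equipartition gives <v^2> close to kT/m in equilibrium
print("<v^2> =", v2.mean(), "  target:", kT / m)
```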
Preprint
Closed-loop optimal control design for high-dimensional nonlinear systems has been a long-standing problem. Traditional methods, such as solving the associated Hamilton-Jacobi-Bellman equation, suffer from the curse of dimensionality. Recent literature proposed a new promising approach based on supervised learning, by leveraging powerful open-loop...
Article
We introduce DeePKS-kit, an open-source software package for developing machine learning based energy and density functional models. DeePKS-kit is interfaced with PyTorch, an open-source machine learning library, and PySCF, an ab initio computational chemistry program that provides simple and customized tools for developing quantum chemistry codes....
Preprint
One of the oldest and most studied subjects in scientific computing is algorithms for solving partial differential equations (PDEs). A long list of numerical methods has been proposed and successfully used for various applications. In recent years, deep learning methods have shown their superiority for high-dimensional PDEs where traditional method...
Preprint
Full-text available
We report an ab initio multi-scale study of lead titanate using the Deep Potential (DP) models, a family of machine learning-based atomistic models, trained on first-principles density functional theory data, to represent potential and polarization surfaces. Our approach includes anharmonic effects beyond the limitations of reduced models and of th...
Preprint
Collisions are common in many dynamical systems with real applications. They can be formulated as hybrid dynamical systems with discontinuities automatically triggered when states traverse certain manifolds. We present an algorithm for the optimal control problem of such hybrid dynamical systems, based on solving the equations derived from the hy...
Preprint
Solving complex optimal control problems has confronted computational challenges for a long time. Recent advances in machine learning have provided us with new opportunities to address these challenges. This paper takes model predictive control, a popular optimal control method, as the primary example to survey recent progress that leverages mac...
Article
Full-text available
To fill the gap between accurate (and expensive) ab initio calculations and efficient atomistic simulations based on empirical interatomic potentials, a new class of descriptions of atomic interactions has emerged and been widely applied; i.e., machine learning potentials (MLPs). One recently developed type of MLP is the Deep Potential (DP) method....
Article
Full-text available
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae. In two cases, we describe the space explicitly up to isomorphism. Using a convenient representation, we study the pointwise properties of two-layer networks and show that functions wh...
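For reference, the standard way of writing the representation and norm in question is sketched below (usual conventions for the ReLU Barron space, not a quotation from the paper):

```latex
\[
  f(x) = \mathbb{E}_{(a,w,b)\sim\mu}\!\big[\, a \, \max(w^{\top}x + b,\, 0) \,\big],
  \qquad
  \|f\|_{\mathcal{B}} = \inf_{\mu}\ \mathbb{E}_{\mu}\!\big[\, |a| \,(\|w\|_{1} + |b|) \,\big],
\]
% the infimum runs over all probability measures mu that realize f on the domain of interest
```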
Article
Full-text available
Machine learning models for the potential energy of multi-atomic systems, such as the deep potential (DP) model, make molecular simulations with the accuracy of quantum mechanical density functional theory possible at a cost only moderately higher than that of empirical force fields. However, the majority of these models lack explicit long-range in...
Preprint
We propose a machine learning enhanced algorithm for solving the optimal landing problem. Using Pontryagin's minimum principle, we derive a two-point boundary value problem for the landing problem. The proposed algorithm uses deep learning to predict the optimal landing time and a space-marching technique to provide good initial guesses for the bou...
Article
Full-text available
The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood. In particular, GAN is vulnerable to the memorization phenomenon, the eventual convergence to the empirical distribution. We consider a simplified GAN model with the generator...
Preprint
Full-text available
To fill the gap between accurate (and expensive) ab initio calculations and efficient atomistic simulations based on empirical interatomic potentials, a new class of descriptions of atomic interactions has emerged and been widely applied; i.e., machine learning potentials (MLPs). One recently developed type of MLP is the Deep Potential (DP) method....
Article
Full-text available
One of the key issues in the analysis of machine learning models is to identify the appropriate function space and norm for the model. This is the set of functions endowed with a quantity which can control the approximation and estimation errors by a particular machine learning model. In this paper, we address this issue for two representative neur...
Article
Full-text available
Enhanced sampling methods such as metadynamics and umbrella sampling have become essential tools for exploring the configuration space of molecules and materials. At the same time, they have long faced a number of issues such as the inefficiency when dealing with a large number of collective variables (CVs) or systems with high free energy barriers...
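As background for the bias-potential idea these methods share, the toy sketch below shows metadynamics-style hill deposition on a single collective variable. It illustrates only the generic mechanism and is not the method proposed in the paper; the free-energy surface, hill parameters, and overdamped dynamics are illustrative assumptions.

```python
# Toy 1D metadynamics: deposit Gaussian hills along a collective variable s.
import numpy as np

rng = np.random.default_rng(0)
kT, dt, gamma = 1.0, 1e-3, 1.0
F  = lambda s: (s**2 - 1.0)**2                   # toy double-well free energy in the CV
dF = lambda s: 4.0 * s * (s**2 - 1.0)

w_hill, sig_hill, stride = 0.05, 0.2, 250        # hill height, width, deposition stride
centers = []                                     # centers of deposited Gaussian hills

def bias_force(s):
    """Force -dV_bias/ds from the accumulated Gaussian hills."""
    if not centers:
        return 0.0
    c = np.asarray(centers)
    return np.sum(w_hill * (s - c) / sig_hill**2 * np.exp(-(s - c)**2 / (2 * sig_hill**2)))

s = -1.0
for step in range(50_000):
    # overdamped Langevin dynamics on the biased landscape F(s) + V_bias(s)
    drift = (-dF(s) + bias_force(s)) / gamma
    s += drift * dt + np.sqrt(2.0 * kT * dt / gamma) * rng.standard_normal()
    if step % stride == 0:
        centers.append(s)

# as hills accumulate, -V_bias(s) approaches F(s) up to an additive constant
print("hills deposited:", len(centers))
```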
Preprint
Full-text available
A long-standing problem in the modeling of non-Newtonian hydrodynamics is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics. The main complication arises from the long polymer relaxation time, the complex molecular structure, and heterogeneous interaction. DeePN²...
Preprint
We propose an efficient, reliable, and interpretable global solution method, the Deep learning-based algorithm for Heterogeneous Agent Models (DeepHAM), for solving high-dimensional heterogeneous agent models with aggregate shocks. The state distribution is approximately represented by a set of optimal generalized moments. Deep neural network...
Preprint
Full-text available
Machine learning models for the potential energy of multi-atomic systems, such as the deep potential (DP) model, make possible molecular simulations with the accuracy of quantum mechanical density functional theory, at a cost only moderately higher than that of empirical force fields. However, the majority of these models lack explicit long-range i...
Article
Full-text available
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in...
Article
Full-text available
We introduce a new family of numerical algorithms for approximating solutions of general high-dimensional semilinear parabolic partial differential equations at single space-time points. The algorithm is obtained through a delicate combination of the Feynman–Kac and the Bismut–Elworthy–Li formulas, and an approximate decomposition of the Picard fix...
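The fixed-point structure exploited by such algorithms can be sketched as follows for the drift-free case (a simplification; the paper treats more general semilinear parabolic equations):

```latex
\[
  \partial_t u + \tfrac12 \Delta u + f(u) = 0, \qquad u(T,x) = g(x),
\]
\[
  u(t,x) = \mathbb{E}\big[g(x + W_{T-t})\big]
         + \int_t^{T} \mathbb{E}\big[f\big(u(s,\, x + W_{s-t})\big)\big]\,\mathrm{d}s,
\]
% W is a standard Brownian motion; the algorithm approximates this Picard fixed point
% with a nested multilevel Monte Carlo decomposition
```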
Article
We propose a systematic method for learning stable and physically interpretable dynamical models using sampled trajectory data from physical processes based on a generalized Onsager principle. The learned dynamics are autonomous ordinary differential equations parametrized by neural networks that retain clear physical structure information, such as...
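One structure consistent with a generalized Onsager principle, shown here as an illustrative sketch rather than the paper's exact parametrization, combines a symmetric positive semidefinite dissipative part with an antisymmetric conservative part, which makes the learned free energy automatically non-increasing:

```latex
\[
  \frac{\mathrm{d}h}{\mathrm{d}t} = -\big(M(h) + W(h)\big)\,\nabla V(h),
  \qquad M = M^{\top} \succeq 0, \quad W = -W^{\top},
\]
\[
  \frac{\mathrm{d}}{\mathrm{d}t} V(h(t)) = -\nabla V^{\top} M \,\nabla V \;\le\; 0 .
\]
```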
Preprint
Full-text available
The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood. In particular, GAN is vulnerable to the memorization phenomenon, the eventual convergence to the empirical distribution. We consider a simplified GAN model with the generator...
Article
By integrating artificial intelligence algorithms and physics-based simulations, researchers are developing new models that are both reliable and interpretable.
Article
Full-text available
Using the Deep Potential methodology, we construct a model that reproduces accurately the potential energy surface of the SCAN approximation of density functional theory for water, from low temperature and pressure to about 2400 K and 50 GPa, excluding the vapor stability region. The computational efficiency of the model makes it possible to predic...
Preprint
Full-text available
Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states. However, most analysis of such algorithms gives rise to error bounds that involve either the number of states or the number of features. This paper considers th...
Preprint
Full-text available
Enhanced sampling methods such as metadynamics and umbrella sampling have become essential tools for exploring the configuration space of molecules and materials. At the same time, they have long faced the following dilemma: Since they are only effective with a small number of collective variables (CVs), choosing a proper set of CVs becomes critica...
Preprint
We propose a unified framework that extends the inference methods for classical hidden Markov models to continuous settings, where both the hidden states and observations occur in continuous time. Two different settings are analyzed: (1) hidden jump process with a finite state space; (2) hidden diffusion process with a continuous state space. For e...
Article
Solid-state electrolyte materials with superior lithium ionic conductivities are vital to next-generation Li-ion batteries. Molecular dynamics could provide atomic-scale information to understand the diffusion process of Li ions in these superionic conductor materials. Here, we implement the deep potential generator to set up an efficient protoc...
Article
Full-text available
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces. The general technique is then applied to show that reproducing kernel Hilbert spaces are poor $L^2$...
Preprint
Full-text available
Using the Deep Potential methodology, we construct a model that reproduces accurately the potential energy surface of the SCAN approximation of density functional theory for water, from low temperature and pressure to about 2400 K and 50 GPa, excluding the vapor stability region. The computational efficiency of the model makes it possible to predic...
Article
We present the GPU version of DeePMD-kit, which, upon training a deep neural network model using ab initio data, can drive extremely large-scale molecular dynamics (MD) simulation with ab initio accuracy. Our tests show that for a water system of 12,582,912 atoms, the GPU version can be 7 times faster than the CPU version under the same power consu...
Article
We propose the coarse-grained spectral projection method (CGSP), a deep learning-assisted approach for tackling quantum unitary dynamics problems with an emphasis on quench dynamics. We show that CGSP can extract spectral components of many-body quantum states systematically with a sophisticated neural network quantum ansatz. CGSP fully exploits the...
Preprint
Full-text available
We introduce DeePKS-kit, an open-source software package for developing machine learning based energy and density functional models. DeePKS-kit is interfaced with PyTorch, an open-source machine learning library, and PySCF, an ab initio computational chemistry program that provides simple and customized tools for developing quantum chemistry codes....
Preprint
A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) +b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a sing...
Article
We propose a general machine learning-based framework for building an accurate and widely applicable energy functional within the framework of generalized Kohn-Sham density functional theory. To this end, we develop a way of training self-consistent models that are capable of taking large datasets from different systems and different kinds of label...
Preprint
We use explicit representation formulas to show that solutions to certain partial differential equations can be represented efficiently using artificial neural networks, even in high dimension. Conversely, we present examples in which the solution fails to lie in the function space associated to a neural network under consideration.
Preprint
Full-text available
Models for learning probability distributions such as generative models and density estimators behave quite differently from models for learning functions. One example is found in the memorization phenomenon, namely the ultimate convergence to the empirical distribution, that occurs in generative adversarial networks (GANs). For this reason, the is...
Article
We introduce a machine-learning-based framework for constructing a continuum non-Newtonian fluid dynamics model directly from a microscale description. Dumbbell polymer solutions are used as examples to demonstrate the essential ideas. To faithfully retain molecular fidelity, we establish a micro-macro correspondence via a set of encoders for the m...
Preprint
It is not yet clear why Adam-like adaptive gradient algorithms suffer from worse generalization performance than SGD despite their faster training speed. This work aims to provide an understanding of this generalization gap by analyzing their local convergence behaviors. Specifically, we observe the heavy tails of gradient noise in these algorithms....
Article
We prove that the gradient descent training of a two-layer neural network on empirical or population risk may not decrease population risk at an order faster than $t^{-4/(d-2)}$...
Preprint
We consider binary and multi-class classification problems using hypothesis classes of neural networks. For a given hypothesis class, we use Rademacher complexity estimates and direct approximation theorems to obtain a priori error estimates for regularized loss functionals.
Preprint
Neural network-based machine learning is capable of approximating functions in very high dimension with unprecedented efficiency and accuracy. This has opened up many exciting new possibilities, not just in traditional areas of artificial intelligence, but also in scientific computing and computational science. At the same time, machine learning ha...
Preprint
Full-text available
The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning. In the tradition of good old applied mathematics, we will not only give attention to rigorous mathematical results, but also the insight we have gai...
Preprint
Full-text available
We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data. We consider the simple but representative setting of using continuous-time linear RNNs to learn from data generated by linear relationships. Mathematically, the latter can be understo...
Preprint
The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations. Three types of qualitative features are observed in the training loss curve: fast initial convergence, oscillations and large spikes. The sign gradient descent (signGD) algorithm, which is the limit of...
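One way to see the connection to signGD, sketched here under the simplifying assumption of zero smoothing parameters, zero numerical offset, and no bias correction, is that the Adam update then collapses to a pure sign update:

```latex
\[
  m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
  v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2}, \qquad
  \theta_{t+1} = \theta_t - \eta\, \frac{m_t}{\sqrt{v_t} + \epsilon},
\]
\[
  \beta_1 = \beta_2 = 0,\ \epsilon = 0
  \;\Longrightarrow\;
  \theta_{t+1} = \theta_t - \eta\, \frac{g_t}{\sqrt{g_t^{2}}}
             = \theta_t - \eta\, \operatorname{sign}(g_t).
\]
```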
Article
We characterize the meaning of words with language-independent numerical fingerprints, through a mathematical analysis of recurring patterns in texts. Approximating texts by Markov processes on a long-range time scale, we are able to extract topics, discover synonyms, and sketch semantic fields from a particular document of moderate length, without...
Preprint
Full-text available
We propose a systematic method for learning stable and interpretable dynamical models using sampled trajectory data from physical processes based on a generalized Onsager principle. The learned dynamics are autonomous ordinary differential equations parameterized by neural networks that retain clear physical structure information, such as free ener...
Article
Full-text available
We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, in the spirit of classical numerical analysis. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural...
Preprint
Full-text available
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in a very high dimension, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in...
Preprint
The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size. This behavior is characterized by the appearance of a large generalization gap, and is due to the occurrence of very small eigenvalues of the associated Gram matrix. In this paper, we examine the dynamic behavior of the...
Article
We introduce the Deep Post–Hartree–Fock (DeePHF) method, a machine learning based scheme for constructing accurate and transferable models for the ground-state energy of electronic structure problems. DeePHF predicts the energy difference between results of highly accurate models such as the coupled cluster method and low accuracy models such as the...
Preprint
Full-text available
We propose a general machine learning-based framework for building an accurate and widely applicable energy functional within the framework of generalized Kohn-Sham density functional theory. To this end, we develop a way of training self-consistent models that are capable of taking large datasets from different systems and different kinds of label...
Preprint
We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width. The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm. Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable general...
Article
We introduce a deep neural network to model, in a symmetry-preserving way, the environmental dependence of the centers of the electronic charge. The model learns from ab initio density functional theory, wherein the electronic centers are uniquely assigned by the maximally localized Wannier functions. When combined with the deep potential model of th...
Preprint
We propose the coarse-grained spectral projection method (CGSP), a deep learning approach for tackling quantum unitary dynamics problems with an emphasis on quench dynamics. We show CGSP can extract spectral components of many-body quantum states systematically with a highly entangled neural network quantum ansatz. CGSP exploits fully the linear unita...
Preprint
A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons. It is found that for Xavier-like initialization, there are two distinctive phases i...
Preprint
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae. In two cases, we describe the space explicitly up to isomorphism. Using a convenient representation, we study the pointwise properties of two-layer networks and show that functions wh...