Source publication
In this work, we are concerned with neural network guided goal-oriented a posteriori error estimation and adaptivity using the dual weighted residual method. The primal problem is solved using classical Galerkin finite elements. The adjoint problem is solved in strong form with a feedforward neural network using two or three hidden layers. The main...
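To make the setting concrete, the following is a schematic sketch of the standard dual weighted residual (DWR) error identity underlying this approach; the notation is generic and not taken verbatim from the paper. For a primal Galerkin solution $u_h$ and a quantity of interest $J$, the error representation reads
$$ J(u) - J(u_h) \approx \rho(u_h)(z - i_h z), $$
where $\rho(u_h)(\cdot)$ is the primal residual, $z$ the exact adjoint solution and $i_h z$ its interpolation into the primal space. Since $z$ is unknown, it is replaced here by a feedforward neural network approximation $z_{\mathcal{N}}$ obtained from the strong form of the adjoint problem, giving the computable estimator $\eta := \rho(u_h)(z_{\mathcal{N}} - i_h z_{\mathcal{N}})$.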
Citations
... These include the test function choice [7], forward solve [8], adjoint solve [9], derivative recovery procedure [10], error estimation [11], metric/monitor function/sizing field construction step [12,13,14], and the entire mesh adaptation loop [6,15,16]. ...
... A similar 'focused' approach is also used in [9], which emulates the adjoint solve procedure. This is done on the base mesh and the data-driven adjoint solution is projected into an enriched space, where error indicators are assembled and thereby used to drive mesh adaptation. ...
Given a partial differential equation (PDE), goal-oriented error estimation allows us to understand how errors in a diagnostic quantity of interest (QoI), or goal, occur and accumulate in a numerical approximation, for example using the finite element method. By decomposing the error estimates into contributions from individual elements, it is possible to formulate adaptation methods, which modify the mesh with the objective of minimising the resulting QoI error. However, the standard error estimate formulation involves the true adjoint solution, which is unknown in practice. As such, it is common practice to approximate it with an 'enriched' approximation (e.g. in a higher order space or on a refined mesh). Doing so generally results in a significant increase in computational cost, which can be a bottleneck compromising the competitiveness of (goal-oriented) adaptive simulations. The central idea of this paper is to develop a "data-driven" goal-oriented mesh adaptation approach through the selective replacement of the expensive error estimation step with an appropriately configured and trained neural network. In doing so, the error estimator may be obtained without even constructing the enriched spaces. An element-by-element construction is employed here, whereby local values of various parameters related to the mesh geometry and underlying problem physics are taken as inputs, and the corresponding contribution to the error estimator is taken as output. We demonstrate that this approach is able to obtain the same accuracy with a reduced computational cost, for adaptive mesh test cases related to flow around tidal turbines, which interact via their downstream wakes, and where the overall power output of the farm is taken as the QoI. Moreover, we demonstrate that the element-by-element approach implies reasonably low training costs.
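As a rough illustration of the element-by-element construction described above, the sketch below shows a small multilayer perceptron mapping hypothetical per-element features (mesh geometry and local physics parameters) to a predicted error-indicator contribution. The feature count, network width, optimizer settings and training data are placeholders, not the configuration used in the paper.

```python
# Hedged sketch: a per-element surrogate for goal-oriented error indicators.
# Features and targets are assumed to come from cheap per-element data and a
# reference (enriched-space) DWR estimator, respectively.
import torch
import torch.nn as nn

class ElementErrorNet(nn.Module):
    """Maps per-element features to a predicted error-indicator contribution."""
    def __init__(self, n_features: int, width: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = ElementErrorNet(n_features=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
features = torch.randn(1024, 10)   # placeholder per-element inputs
targets = torch.randn(1024, 1)     # placeholder reference indicator values
for epoch in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(features), targets)
    loss.backward()
    opt.step()
```

Once trained, such a surrogate can be evaluated element by element on the current mesh, so that error indicators for adaptation are obtained without constructing the enriched spaces.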
... For those related to the deep Ritz method; see Xu [11]. Recently, a posteriori error analysis has also been studied, in particular goal-oriented analysis using the dual-weighted residual (DWR) methodology; see, e.g., Roth, Schröder and Wick [46], Minakowski & Richter [32] and Chakraborty, Wick, Zhuang & Rabczuk [12]. We note that in our current work, while we have in mind the error analysis for neural-control approximations, the abstract analysis presented in Section 2 is essentially an extension of the above-mentioned a priori analysis to a certain class of problems involving a convex and differentiable cost functional. ...
There is tremendous potential in using neural networks to optimize numerical methods. In this paper, we introduce and analyse a framework for the neural optimization of discrete weak formulations, suitable for finite element methods. The main idea of the framework is to include a neural-network function acting as a control variable in the weak form. Finding the neural control that (quasi-)minimizes a suitable cost (or loss) functional then yields a numerical approximation with desirable attributes. In particular, the framework allows in a natural way the incorporation of known data of the exact solution, or the incorporation of stabilization mechanisms (e.g., to remove spurious oscillations). The main result of our analysis pertains to the well-posedness and convergence of the associated constrained-optimization problem. In particular, we prove, under certain conditions, that the discrete weak forms are stable and that quasi-minimizing neural controls exist, which converge quasi-optimally. We specialize the analysis results to Galerkin, least-squares and minimal-residual formulations, where the neural-network dependence appears in the form of suitable weights. Elementary numerical experiments support our findings and demonstrate the potential of the framework.
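In abstract terms, and with generic notation not taken from the paper, the framework described above can be sketched as a constrained-optimization problem: find neural-network parameters $\theta$ quasi-minimizing a cost functional $\mathcal{J}$,
$$ \min_{\theta} \; \mathcal{J}\big(u_h(\theta)\big) \quad \text{subject to} \quad b_{\theta}\big(u_h(\theta), v\big) = \ell_{\theta}(v) \quad \forall\, v \in V_h, $$
where the discrete weak form $b_{\theta}$ (and possibly the right-hand side $\ell_{\theta}$) depends on the neural control, for instance through suitable weights in Galerkin, least-squares or minimal-residual formulations.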
... Using tools from differential equations for studying the stability of neural network architectures was the subject of [22]. We also have results in error control and adaptivity and recently approximated the adjoint differential equation with the help of a neural network [23] (we also refer to similar work for goal-oriented finite element discretizations [24]). From a mathematical-numerics viewpoint the following aspects are important according to our own experiences (but see also [18], p. 213): the loss function is nonconvex and nonlinear and therefore the solution is not unique; sensitivity to initial guesses for the nonlinear iteration (e.g., stochastic gradient descent, Adam's algorithm, quasi-Newton methods such as L-BFGS, or full Newton-type methods) yielding solutions in 'wrong' local minima; number and distribution of scattered/collocation points; size of the neural network (number of hidden layers and neurons per layer); computational cost associated with the nonlinear solution (speed of convergence, number of iterations, choice of learning rate parameter); over-fitting, such that noise in given data is trained as well. ...
In this work, we discuss some pitfalls when solving differential equations with neural networks. Due to the highly nonlinear cost functional, the optimisation may converge to local minima, yielding functions that do not solve the problem. The main reason for these failures is sensitivity to the initial guesses for the nonlinear iteration. We apply known algorithms and corresponding implementations, including code snippets, and present an example and a counterexample for the logistic differential equation. These findings are further substantiated with variations in collocation points and learning rates.
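The failure modes listed above are easy to reproduce in a minimal collocation-type solver. The sketch below, for the logistic differential equation $u'(t) = u(t)(1 - u(t))$ with a placeholder initial condition $u(0) = 0.5$, is a hypothetical setup (seed, learning rate, architecture and collocation points are illustrative, not those of the cited paper); changing the random seed or the learning rate can drive the iteration into a different local minimum.

```python
# Hedged sketch of a collocation-type neural solver for the logistic ODE,
# illustrating sensitivity to initialization and learning rate.
import torch
import torch.nn as nn

torch.manual_seed(0)  # different seeds can land in different local minima

net = nn.Sequential(nn.Linear(1, 20), nn.Tanh(),
                    nn.Linear(20, 20), nn.Tanh(),
                    nn.Linear(20, 1))
t = torch.linspace(0.0, 5.0, 100).reshape(-1, 1).requires_grad_(True)  # collocation points
opt = torch.optim.Adam(net.parameters(), lr=1e-3)                      # learning rate matters

for step in range(5000):
    opt.zero_grad()
    u = net(t)
    du = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    residual = du - u * (1.0 - u)          # ODE residual at the collocation points
    ic = net(torch.zeros(1, 1)) - 0.5      # initial-condition penalty
    loss = (residual ** 2).mean() + (ic ** 2).mean()
    loss.backward()
    opt.step()
```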
... Using ansatz functions of the above form has become increasingly popular and it has been observed that it simplifies the training process and produces more accurate solutions, see for instance Berg and Nyström (2018); Roth et al. (2021); Lyu et al. (2020); Chen et al. (2020). It is also possible to encode Neumann or Robin boundary conditions in a similar way, we refer the reader to Lyu et al. (2020). ...
We analyse the difference in convergence mode using exact versus penalised boundary values for the residual minimisation of PDEs with neural network type ansatz functions, as is commonly done in the context of physics informed neural networks. It is known that using an $L^2$ boundary penalty leads to a loss of regularity of $3/2$, meaning that approximation in $H^2$ yields a priori estimates in $H^{1/2}$. These notes demonstrate how this loss of regularity can be circumvented if the functions in the ansatz class satisfy the boundary values exactly. Furthermore, it is shown that in this case the loss function provides a consistent a posteriori estimator of the $H^2$-norm error made by the residual minimisation method. We provide analogous results for linear time-dependent problems and discuss the implications of measuring the residual in Sobolev norms.
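The exact-boundary-value ansatz referred to in the citation above can be sketched, in generic notation, as
$$ u_{\theta}(x) = \bar{u}(x) + d(x)\, N_{\theta}(x), $$
where $N_{\theta}$ is the neural network, $\bar{u}$ is a smooth extension of the prescribed Dirichlet data and $d$ is a smooth function vanishing on the boundary, so that $u_{\theta}$ satisfies the boundary values exactly for every choice of parameters $\theta$. Residual minimisation then works with a loss over the domain only, such as $\mathcal{L}(\theta) = \|\mathcal{R}(u_{\theta})\|^2$, in contrast to the penalised variant, which adds an $L^2$ boundary term and incurs the regularity loss mentioned above.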
We analyze neural network solutions to partial differential equations obtained with Physics Informed Neural Networks. In particular, we apply tools of classical finite element error analysis to obtain conclusions about the error of the Deep Ritz method applied to the Laplace and the Stokes equations. Further, we develop an a posteriori error estimator for neural network approximations of partial differential equations. The proposed approach is based on the dual weighted residual estimator and is intended to serve as a stopping criterion that guarantees the accuracy of the solution independently of the design of the neural network training. The results are accompanied by computational examples for the Laplace and Stokes problems.
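Read schematically, and as a generic sketch rather than the paper's precise algorithm, such a stopping criterion would monitor the estimated QoI error during training of the network solution $u_{\theta}$ and terminate once $|\eta(u_{\theta})| \le \mathrm{TOL}$, where $\eta(u_{\theta}) \approx J(u) - J(u_{\theta})$ is the dual weighted residual estimate.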