Philipp Christian Petersen

Verified
Philipp verified their affiliation via an institutional email.
  • Ph.D.
  • Professor (Associate) at University of Vienna

About

72 Publications
43,881 Reads
2,233 Citations
Introduction
Philipp Christian Petersen currently works at the Department of Mathematics, University of Vienna. Philipp does research in Approximation theory, Neural Networks, and Signal and Image Processing.
Current institution
University of Vienna
Current position
  • Professor (Associate)

Publications (72)
Article
Full-text available
We study the necessary and sufficient complexity of ReLU neural networks---in terms of depth and number of weights---which is required for approximating classifier functions in an $L^2$-sense. As a model class, we consider the set $\mathcal{E}^\beta (\mathbb R^d)$ of possibly discontinuous piecewise $C^\beta$ functions $f : [-1/2, 1/2]^d \to \mathb...
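As a purely numerical illustration of this setting (a minimal sketch with a hypothetical target function, architecture, and optimizer, not the construction or the rates from the paper), one can fit a small ReLU network to a piecewise-smooth function with a single jump and monitor the empirical $L^2$ error:

```python
# Minimal illustration only: fit a ReLU network to a piecewise-smooth target
# with one discontinuity on [-1/2, 1/2] and report the empirical L^2 error.
# Target, architecture, and optimizer are placeholder choices.
import torch

torch.manual_seed(0)
target = lambda x: torch.where(x > 0.1, torch.cos(3.0 * x), x ** 2)  # one jump at x = 0.1

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(3000):
    x = torch.rand(512, 1) - 0.5                    # uniform samples on [-1/2, 1/2]
    loss = ((net(x) - target(x)) ** 2).mean()       # empirical squared L^2 error
    opt.zero_grad(); loss.backward(); opt.step()

x_test = torch.linspace(-0.5, 0.5, 2001).unsqueeze(1)
with torch.no_grad():
    err = ((net(x_test) - target(x_test)) ** 2).mean().sqrt()
print(f"empirical L2 error: {err.item():.4f}")
```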
Article
Full-text available
We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties: It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^p$-norms, $0<p<\infty$, for all practical...
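A standard one-dimensional example of the non-closedness (a textbook illustration, not taken from the paper's proofs): with $\varrho(x) = \max\{0, x\}$, the fixed-size two-neuron ReLU networks below converge in every $L^p$-norm on $[-1,1]$, $0 < p < \infty$, to a discontinuous indicator function, which no ReLU network of any fixed size can realize, since such networks are continuous.

```latex
% Two-neuron ReLU networks converging in L^p to a function outside the
% fixed-size realizable set; illustration only, not from the paper.
\[
  f_n(x) \;=\; n\Bigl(\varrho\bigl(x + \tfrac{1}{n}\bigr) - \varrho(x)\Bigr)
  \;\xrightarrow[\;n\to\infty\;]{L^p([-1,1])}\; \mathbf{1}_{[0,\infty)}(x),
  \qquad 0 < p < \infty .
\]
```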
Article
Full-text available
Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominately, in imaging sciences. Of particular importance is the analysis of wavefront sets and the correct extraction of those. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images...
Preprint
Full-text available
We prove exponential expressivity with stable ReLU Neural Networks (ReLU NNs) in $H^1(\Omega)$ for weighted analytic function classes in certain polytopal domains $\Omega$, in space dimension $d=2,3$. Functions in these classes are locally analytic on open subdomains $D\subset \Omega$, but may exhibit isolated point singularities in the interior of...
Preprint
Full-text available
We perform a comprehensive numerical study of the effect of approximation-theoretical results for neural networks on practical learning problems in the context of numerical analysis. As the underlying model, we study the machine-learning-based solution of parametric partial differential equations. Here, approximation theory predicts that the perfor...
Preprint
Deep learning's success comes with growing energy demands, raising concerns about the long-term sustainability of the field. Spiking neural networks, inspired by biological neurons, offer a promising alternative with potential computational and energy-efficiency gains. This article examines the computational properties of spiking networks through t...
Preprint
Full-text available
We prove that a classifier with a Barron-regular decision boundary can be approximated with a rate of high polynomial degree by ReLU neural networks with three hidden layers when a margin condition is assumed. In particular, for strong margin conditions, high-dimensional discontinuous classifiers can be approximated with a rate that is typically on...
Preprint
In recent work it has been shown that determining a feedforward ReLU neural network to within high uniform accuracy from point samples suffers from the curse of dimensionality in terms of the number of samples needed. As a consequence, feedforward ReLU neural networks are of limited use for applications where guaranteed high uniform accuracy is req...
Preprint
Full-text available
We introduce a conceptual framework for numerically solving linear elliptic, parabolic, and hyperbolic PDEs on bounded, polytopal domains in Euclidean spaces by deep neural networks. The PDEs are recast as minimization of a least-squares (LSQ for short) residual of an equivalent, well-posed first-order system, over parametric families of deep neura...
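A toy sketch of the least-squares residual idea under assumed simplifications (a 1D Poisson problem with a placeholder architecture, sampling scheme, and optimizer; not the paper's formulation or function classes): $-u'' = f$ on $(0,1)$ with $u(0) = u(1) = 0$ is recast as the first-order system $u' = p$, $p' = -f$, and the squared residual is minimized over a small network.

```python
# Toy least-squares residual sketch (assumed setup, not the paper's method):
# solve -u'' = f on (0,1), u(0) = u(1) = 0, via the first-order system u' = p,
# p' = -f, minimizing the squared residual over a small fully-connected net.
import torch

torch.manual_seed(0)
f = lambda x: (torch.pi ** 2) * torch.sin(torch.pi * x)   # exact solution u = sin(pi x)

net = torch.nn.Sequential(                                 # outputs (u, p)
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 2),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x_bdry = torch.tensor([[0.0], [1.0]])                      # boundary collocation points

for step in range(3000):
    x = torch.rand(256, 1, requires_grad=True)             # interior collocation points
    u, p = net(x).split(1, dim=1)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    dp = torch.autograd.grad(p.sum(), x, create_graph=True)[0]
    residual = ((du - p) ** 2).mean() + ((dp + f(x)) ** 2).mean()
    boundary = (net(x_bdry)[:, :1] ** 2).mean()
    loss = residual + boundary                             # LSQ residual + boundary term
    opt.zero_grad(); loss.backward(); opt.step()
```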
Preprint
Full-text available
We study the problem of approximating and estimating classification functions that have their decision boundary in the $RBV^2$ space. Functions of $RBV^2$ type arise naturally as solutions of regularized neural network learning problems and neural networks can approximate these functions without the curse of dimensionality. We modify existing resul...
Article
Full-text available
Introduction: Polarized endurance training is an important and frequently discussed training intensity distribution (TID). The polarized TID is described as the largest fraction of training time or sessions spent with low-intensity exercise in intensity zone (z)1, followed by a considerable fraction of high-intensity exercise (z3), and a relatively...
Preprint
Full-text available
This book provides an introduction to the mathematical analysis of deep learning. It covers fundamental results in approximation theory, optimization theory, and statistical learning theory, which are the three main pillars of deep neural network theory. Serving as a guide for students and researchers in mathematics and related fields, the book aim...
Preprint
Full-text available
The generalized Gauss-Newton (GGN) optimization method incorporates curvature estimates into its solution steps, and provides a good approximation to the Newton method for large-scale optimization problems. GGN has been found particularly interesting for practical training of deep neural networks, not only for its impressive convergence speed, but...
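For orientation only (a generic nonlinear least-squares toy problem, not the paper's method or models): for the squared loss, the generalized Gauss-Newton matrix reduces to the classical Gauss-Newton matrix $J^\top J$, and a damped step solves $(J^\top J + \lambda I)\,\delta = J^\top r$.

```python
# Minimal damped Gauss-Newton sketch on a toy model m(x) = exp(a x) + b.
# For the squared loss, the GGN matrix coincides with J^T J.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.exp(1.5 * x) + 0.3 + 0.01 * rng.standard_normal(50)   # synthetic data

theta = np.array([1.0, 0.0])    # parameters (a, b)
lam = 1e-3                      # damping (Levenberg-Marquardt style)

for _ in range(20):
    a, b = theta
    r = np.exp(a * x) + b - y                                     # residuals
    J = np.stack([x * np.exp(a * x), np.ones_like(x)], axis=1)    # Jacobian dr/dtheta
    G = J.T @ J + lam * np.eye(2)                                 # damped GGN matrix
    theta = theta - np.linalg.solve(G, J.T @ r)                   # curvature-preconditioned step

print(theta)   # approaches (1.5, 0.3)
```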
Preprint
Full-text available
We study the learning problem associated with spiking neural networks. Specifically, we consider hypothesis sets of spiking neural networks with affine temporal encoders and decoders and simple spiking neurons having only positive synaptic weights. We demonstrate that the positivity of the weights continues to enable a wide range of expressivity r...
Article
Full-text available
We study the training of deep neural networks by gradient descent where floating-point arithmetic is used to compute the gradients. In this framework and under realistic assumptions, we demonstrate that it is highly unlikely to find ReLU neural networks that maintain, in the course of training with gradient descent, superlinearly many affine pieces...
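As a rough illustration of what counting affine pieces means in one dimension (a hypothetical one-hidden-layer example, unrelated to the paper's training-dynamics analysis), the sketch below enumerates the breakpoints of a random ReLU network and verifies that the function is affine between them:

```python
# One-hidden-layer ReLU net x -> w2 . relu(w1 * x + b1) + b2: each neuron
# contributes at most one breakpoint, so the number of affine pieces on [0, 1]
# is at most (number of hidden neurons) + 1. Illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_hidden = 16
w1, b1 = rng.standard_normal(n_hidden), rng.standard_normal(n_hidden)   # hidden layer
w2, b2 = rng.standard_normal(n_hidden), rng.standard_normal()           # output layer

def net(x):
    """Evaluate the ReLU network pointwise on a 1D array x."""
    return np.maximum(np.outer(x, w1) + b1, 0.0) @ w2 + b2

kinks = -b1 / w1                                        # neuron i kinks where w1[i]*x + b1[i] = 0
kinks_inside = np.sort(kinks[(kinks > 0.0) & (kinks < 1.0)])
pieces = 1 + len(kinks_inside)                          # affine pieces of net on [0, 1]
print(pieces, "affine pieces, at most", n_hidden + 1)

# sanity check: between consecutive kinks the map is affine (second differences vanish)
grid = np.concatenate(([0.0], kinks_inside, [1.0]))
for a, b in zip(grid[:-1], grid[1:]):
    ys = net(np.linspace(a, b, 5))
    assert np.allclose(np.diff(ys, 2), 0.0, atol=1e-9)
```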
Preprint
Full-text available
We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathemat...
Chapter
Full-text available
In recent years the development of new classification and regression algorithms based on deep learning has led to a revolution in the fields of artificial intelligence, machine learning, and data analysis. The development of a theoretical foundation to guarantee the success of these algorithms constitutes one of the most active and exciting researc...
Preprint
Full-text available
We study the generalization capacity of group convolutional neural networks. We identify precise estimates for the VC dimensions of simple sets of group convolutional neural networks. In particular, we find that for infinite groups and appropriately chosen convolutional kernels, already two-parameter families of convolutional neural networks have a...
Article
We study the problem of reconstructing solutions of inverse problems when only noisy measurements are available. We assume that the problem can be modeled with an infinite-dimensional forward operator that is not continuously invertible. Then, we restrict this forward operator to finite-dimensional spaces so that the inverse is Lipschitz continuous...
Preprint
Full-text available
We study the training of deep neural networks by gradient descent where floating-point arithmetic is used to compute the gradients. In this framework and under realistic assumptions, we demonstrate that it is highly unlikely to find ReLU neural networks that maintain, in the course of training with gradient descent, superlinearly many affine pieces...
Preprint
Full-text available
We study the problem of reconstructing solutions of inverse problems with neural networks when only noisy data is available. We assume the problem can be modeled with an infinite-dimensional forward operator that is not continuously invertible. Then, we restrict this forward operator to finite-dimensional spaces so that the inverse is Lipschitz con...
Article
Full-text available
In certain polytopal domains Ω, in space dimension d=2,3, we prove exponential expressivity with stable ReLU Neural Networks (ReLU NNs) in H1(Ω) for weighted analytic function classes. These classes comprise in particular solution sets of source and eigenvalue problems for elliptic PDEs with analytic data. Functions in these classes are locally ana...
Article
Full-text available
We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations. In particular, without any knowledge of its concrete shape, we use the inherent low dimensionality of the solution manifold to obtain approximation rates which are significantly superior to those provided by...
Article
Full-text available
We present a deep-learning-based algorithm to jointly solve a reconstruction problem and a wavefront set extraction problem in tomographic imaging. The algorithm is based on a recently developed digital wavefront set extractor as well as the well-known microlocal canonical relation for the Radon transform. We use the wavefront set information about...
Preprint
Full-text available
We study the problem of learning classification functions from noiseless training samples, under the assumption that the decision boundary is of a certain regularity. We establish universal lower bounds for this estimation problem, for general classes of continuous decision boundaries. For the class of locally Barron-regular decision boundaries, we...
Article
Full-text available
We describe the new field of mathematical analysis of deep learning. This field emerged driven by research questions that could not be adequately answered by classical learning theory. These questions concern, among others: the exceptionally accurate predictions of overparametrized neural...
Preprint
Full-text available
We present a deep learning-based algorithm to jointly solve a reconstruction problem and a wavefront set extraction problem in tomographic imaging. The algorithm is based on a recently developed digital wavefront set extractor as well as the well-known microlocal canonical relation for the Radon transform. We use the wavefront set information about...
Article
Full-text available
We perform a comprehensive numerical study of the effect of approximation-theoretical results for neural networks on practical learning problems in the context of numerical analysis. As the underlying model, we study the machine-learning-based solution of parametric partial differential equations. Here, approximation theory for fully-connected neur...
Preprint
Full-text available
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent...
Article
Full-text available
We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth. Therefore, approximation of these functions suffers from a curse of dimens...
Preprint
Full-text available
We prove bounds for the approximation and estimation of certain classification functions using ReLU neural networks. Our estimation bounds provide a priori performance guarantees for empirical risk minimization using networks of a suitable size, depending on the number of training samples available. The obtained approximation and estimation rates a...
Article
We introduce two shearlet-based Ginzburg–Landau energies, based on the continuous and the discrete shearlet transform. The energies result from replacing the elastic energy term of a classical Ginzburg–Landau energy by the weighted L2-norm of a shearlet transform. The asymptotic behaviour of sequences of these energies is analysed within the framew...
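For reference, the classical Ginzburg–Landau energy whose elastic term is modified here has the standard form below (shown with a common double-well potential; the exact scaling and potential used in the paper may differ), with the first term replaced by the weighted $L^2$-norm of a shearlet transform of $u$:

```latex
% Classical Ginzburg-Landau energy (standard form, for orientation only).
\[
  GL_\varepsilon(u) \;=\; \int_\Omega \Bigl( \varepsilon\,\lvert \nabla u(x) \rvert^2
  \;+\; \tfrac{1}{\varepsilon}\, W\bigl(u(x)\bigr) \Bigr)\, dx,
  \qquad W(u) = \bigl(1 - u^2\bigr)^2 .
\]
```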
Article
Full-text available
We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to \(L^p\)-norms, \(0< p < \infty \), for all p...
Article
Full-text available
In this paper we provide a construction of multiscale systems on a bounded domain $\Omega \subset \mathbb{R}^2$, coined boundary shearlet systems, which satisfy several properties advantageous for applications to imaging science and numerical analysis of partial differential equations. More precisely, we construct boundary shearlet systems that form...
Preprint
Full-text available
We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth. Therefore, approximation of these functions suffers from a curse of dimens...
Article
Approximation rate bounds for emulations of real-valued functions on intervals by deep neural networks (DNNs) are established. The approximation results are given for DNNs based on ReLU activation functions. The approximation error is measured with respect to Sobolev norms. It is shown that ReLU DNNs allow for essentially the same approximation rat...
Article
We analyze to what extent deep Rectified Linear Unit (ReLU) neural networks can efficiently approximate Sobolev regular functions if the approximation error is measured with respect to weaker Sobolev norms. In this context, we first establish upper approximation bounds by ReLU neural networks for Sobolev regular functions by explicitly constructing...
Article
Full-text available
In this paper, we study a newly developed shearlet system on bounded domains which yields frames for $H^s(\Omega)$ for some $s\in \mathbb{N}$, $\Omega \subset \mathbb{R}^2$. We will derive approximation rates with respect to $H^s(\Omega)$ norms for functions whose derivatives admit smooth jumps along curves and demonstrate superior rates to those p...
Preprint
We discuss the expressive power of neural networks which use the non-smooth ReLU activation function $\varrho(x) = \max\{0,x\}$ by analyzing the approximation theoretic properties of such networks. The existing results mainly fall into two categories: approximation using ReLU networks with a fixed depth, or using ReLU networks whose depth increases...
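One well-known concrete instance of the depth-accuracy trade-off discussed here is the approximation of $x^2$ on $[0,1]$ by composing ReLU hat functions (a standard construction due to Yarotsky, used here only as an illustration); the sketch below checks numerically that the uniform error decays like $4^{-(m+1)}$ with the number $m$ of composed layers.

```python
# Deep ReLU approximation of x^2 on [0, 1] via composed hat functions
# (standard illustration of depth vs. accuracy; not specific to the paper).
import numpy as np

relu = lambda t: np.maximum(t, 0.0)
hat = lambda t: 2 * relu(t) - 4 * relu(t - 0.5) + 2 * relu(t - 1.0)  # one ReLU layer

def square_approx(x, m):
    """Piecewise-linear approximation of x**2 built from m composed hat functions."""
    out, g = x.copy(), x.copy()
    for s in range(1, m + 1):
        g = hat(g)                    # g_s = hat composed s times
        out -= g / 4.0 ** s
    return out

x = np.linspace(0.0, 1.0, 10_001)
for m in (2, 4, 6):
    print(m, np.max(np.abs(square_approx(x, m) - x ** 2)))   # error ~ 4 ** -(m + 1)
```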
Preprint
Full-text available
We analyze to what extent deep ReLU neural networks can efficiently approximate Sobolev regular functions if the approximation error is measured with respect to weaker Sobolev norms. In this context, we first establish upper approximation bounds by ReLU neural networks for Sobolev regular functions by explicitly constructing the approximating ReLU...
Preprint
Full-text available
We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations. In particular, without any knowledge of its concrete shape, we use the inherent low-dimensionality of the solution manifold to obtain approximation rates which are significantly superior to those provided by...
Preprint
We analyze approximation rates of deep ReLU neural networks for Sobolev-regular functions with respect to weaker Sobolev norms. First, we construct, based on a calculus of ReLU networks, artificial neural networks with ReLU activation functions that achieve certain approximation rates. Second, we establish lower bounds for the approximation by ReLU...
Preprint
Full-text available
We present a novel technique based on deep learning and set theory which yields exceptional classification and prediction results. Having access to a sufficiently large amount of labelled training data, our methodology is capable of predicting the labels of the test data almost always even if the training data is entirely unrelated to the test data...
Technical Report
Approximation rate bounds for expressions of real-valued functions on intervals by deep neural networks (DNNs for short) are established. The approximation results are given for DNNs based on ReLU activation functions, and the approximation error is measured with respect to Sobolev norms. It is shown that ReLU DNNs allow for essentially the same ap...
Preprint
Full-text available
Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominately, in imaging sciences. Of particular importance is the analysis of wavefront sets and the correct extraction of those. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images...
Preprint
Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominately, in imaging sciences. Of particular importance is the analysis of wavefront sets and the correct extraction of those. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images...
Book
The chapters in this volume highlight the state-of-the-art of compressed sensing and are based on talks given at the third international MATHEON conference on the same topic, held from December 4-8, 2017 at the Technical University in Berlin. In addition to methods in compressed sensing, chapters provide insights into cutting edge applications of d...
Preprint
Full-text available
We introduce two shearlet-based Ginzburg--Landau energies, based on the continuous and the discrete shearlet transform. The energies result from replacing the elastic energy term of a classical Ginzburg--Landau energy by the weighted $L^2$-norm of a shearlet transform. The asymptotic behaviour of sequences of these energies is analysed within the f...
Preprint
Full-text available
Convolutional neural networks are the most widely used type of neural networks in applications. In mathematical analysis, however, mostly fully-connected networks are studied. In this paper, we establish a connection between both network architectures. Using this connection, we show that all upper and lower bounds concerning approximation rates of...
Preprint
Full-text available
We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties: It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^p$-norms, $0<p<\infty$, for all practical...
Preprint
In this paper, we study a newly developed shearlet system on bounded domains which yields frames for $H^s(\Omega)$ for some $s\in \mathbb{N}$, $\Omega \subset \mathbb{R}^2$. We will derive approximation rates with respect to $H^s(\Omega)$ norms for functions whose derivatives admit smooth jumps along curves and demonstrate superior rates to those p...
Preprint
We study the necessary and sufficient complexity of ReLU neural networks---in terms of depth and number of weights---which is required for approximating classifier functions in $L^2$. As a model class, we consider the set $\mathcal{E}^\beta (\mathbb R^d)$ of possibly discontinuous piecewise $C^\beta$ functions $f : [-1/2, 1/2]^d \to \mathbb R$, whe...
Conference Paper
Full-text available
We summarize the main results of a recent theory—developed by the authors—establishing fundamental lower bounds on the connectivity and memory requirements of deep neural networks as a function of the complexity of the function class to be approximated by the network. These bounds are shown to be achievable. Specifically, all function classes that...
Article
Regularization techniques for the numerical solution of inverse scattering problems in two space dimensions are discussed. Assuming that the boundary of a scatterer is its most prominent feature, we exploit as model the class of cartoon-like functions. Since functions in this class are asymptotically optimally sparsely approximated by shearlet fram...
Article
We introduce bendlets, a shearlet-like system that is based on anisotropic scaling, translation, shearing, and bending of a compactly supported generator. With shearing being linear and bending quadratic in spatial coordinates, bendlets provide what we term a second-order shearlet system. As we show in this article, the decay rates of the associate...
Article
Full-text available
We derive fundamental lower bounds on the connectivity and the memory requirements of deep neural networks guaranteeing uniform approximation rates for arbitrary function classes in $L^2(\mathbb{R}^d)$. In other words, we establish a connection between the complexity of a function class and the complexity of deep neural networks approximating functions fro...
Preprint
We derive fundamental lower bounds on the connectivity and the memory requirements of deep neural networks guaranteeing uniform approximation rates for arbitrary function classes in $L^2(\mathbb R^d)$. In other words, we establish a connection between the complexity of a function class and the complexity of deep neural networks approximating functi...
Article
We analyze the detection and classification of singularities of functions $f = \chi_B$, where $B \subset \mathbb{R}^d$ and $d = 2,3$. It will be shown how the set $\partial B$ can be extracted by a continuous shearlet transform associated with compactly supported shearlets. Furthermore, if $\partial B$ is a $(d-1)$-dimensional piecewise smooth manifo...
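For orientation, the continuous shearlet transform referenced here is commonly written as follows in dimension $d = 2$ (standard parabolic-scaling form; the cone-adapted, compactly supported variant used in the paper may differ in normalization and parametrization), and points of $\partial B$ are detected through the decay of these coefficients as $a \to 0$:

```latex
% Standard parabolic-scaling form of the continuous shearlet transform in d = 2
% (shown for orientation only; the paper's variant may differ).
\[
  \mathcal{SH}_\psi f(a,s,t) \;=\; \langle f, \psi_{a,s,t}\rangle,
  \qquad
  \psi_{a,s,t}(x) \;=\; a^{-3/4}\,\psi\!\bigl(A_a^{-1} S_s^{-1}(x - t)\bigr),
\]
\[
  A_a = \begin{pmatrix} a & 0 \\ 0 & a^{1/2} \end{pmatrix},
  \qquad
  S_s = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix},
  \qquad a > 0,\; s \in \mathbb{R},\; t \in \mathbb{R}^2 .
\]
```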
Article
We introduce bendlets, a shearlet-like system that is based on anisotropic scaling, translation, shearing, and bending of a compactly supported generator. With shearing being linear and bending quadratic in spatial coordinates, bendlets provide what we term a second-order shearlet system. As we show in this article, the decay rates of the associate...
Preprint
We introduce bendlets, a shearlet-like system that is based on anisotropic scaling, translation, shearing, and bending of a compactly supported generator. With shearing being linear and bending quadratic in spatial coordinates, bendlets provide what we term a second-order shearlet system. As we show in this article, the decay rates of the associate...
Thesis
In this thesis we discuss and extend the theory of shearlet systems. These systems were introduced by Guo, Kutyniok, Labate, Lim and Weiss, and have found a multitude of applications in signal and image processing and related fields since then. The results of this thesis are split into two different but connected parts. In the first part we presen...
Article
We demonstrate that shearlet systems yield superior $N$-term approximation rates compared with wavelet systems of functions whose first or higher order derivatives are piecewise smooth away from smooth discontinuity curves. We will also provide an improved estimate for the decay of shearlet coefficients that intersect a discontinuity curve non-tang...
Article
Linear independence of finite subsets of a frame is closely related to the recently proven Kadison–Singer Conjecture. However, for Gabor frames it is still an open question whether every finite subset is linearly independent. In this paper we consider shearlet systems and show that separable compactly supported shearlet systems indeed exhibit the l...
