# Lars Ruthotto

Emory University | EU · Departments of Mathematics and Computer Science

Dr. rer. nat.

## About

103 Publications

27,301 Reads

2,581 Citations

Introduction

I am an applied mathematician developing computational methods for deep learning and inverse problems. In deep learning, I seek to create new insights and efficient training for continuous models based on ordinary and partial differential equations and use those models to solve high-dimensional optimal control problems. In inverse problems, I have been focussing on applications in image registration and reconstruction.

Additional affiliations

September 2014 - present

January 2013 - August 2014

October 2010 - December 2012

Education

March 2010 - November 2012

## Publications

Diffusion weighted magnetic resonance imaging is a key investigation technique in modern neuroscience. In clinical settings, diffusion weighted imaging (DWI) and its extension to diffusion tensor imaging (DTI) are usually performed using echo-planar imaging (EPI). EPI is the commonly available ultrafast acquisition technique for...

Image registration is one of the most challenging problems in image processing, where ill-posedness arises due to noisy data as well as non-uniqueness and hence the choice of regularization is crucial. This paper presents hyperelasticity as a regularizer and introduces a new and stable numerical implementation. On one hand, hyperelastic registratio...

Estimating parameters of Partial Differential Equations (PDEs) from noisy and indirect measurements requires solutions of ill-posed inverse problems. Such problems arise in a variety of applications such as geophysical, medical imaging, and nondestructive testing. These so-called parameter estimation or inverse medium problems are computationally...

Deep neural networks have become invaluable tools for supervised machine learning, e.g., in classification of text or images. While offering superior flexibility to find and express complicated patterns in data, deep architectures are known to be challenging to design and train so that they generalize well to new data. An important issue are numeri...

Mean field games (MFG) and mean field control (MFC) are critical classes of multi-agent models for efficient analysis of massive populations of interacting agents. Their areas of application span topics in economics, finance, game theory, industrial engineering, crowd motion, and more. In this paper, we provide a flexible machine learning framework...

Diffusion MRI (dMRI) has become a crucial imaging technique within the field of neuroscience and has an increasing number of clinical applications. Although most studies still focus on the brain, there is a growing interest in utilizing dMRI to investigate the healthy or injured spinal cord. The past decade has also seen the development of biophysi...

This paper introduces LSEMINK, an effective modified Newton-Krylov algorithm geared toward minimizing the log-sum-exp function for a linear model. Problems of this kind arise commonly, for example, in geometric programming and multinomial logistic regression. Although the log-sum-exp function is smooth and convex, standard line search Newton-type m...

We propose an alternating minimization heuristic for regression over the space of tropical rational functions with fixed exponents. The method alternates between fitting the numerator and denominator terms via tropical polynomial regression, which is known to admit a closed form solution. We demonstrate the behavior of the alternating minimization...

Score-based diffusion models (SBDM) have recently emerged as state-of-the-art approaches for image generation. Existing SBDMs are typically formulated in a finite-dimensional setting, where images are considered as tensors of a finite size. This paper develops SBDMs in the infinite-dimensional setting, that is, we model the training data as functi...

Graph Neural Networks (GNNs) are limited in their propagation operators. These operators often contain non-negative elements only and are shared across channels and layers, limiting the expressiveness of GNNs. Moreover, some GNNs suffer from over-smoothing, limiting their depth. On the other hand, Convolutional Neural Networks (CNNs) can learn dive...

We present a neural network approach for approximating the value function of high-dimensional stochastic control problems. Our training process simultaneously updates our value function estimate and identifies the part of the state space likely to be visited by optimal trajectories. Our approach leverages insights from optimal control theory and th...

We propose Multivariate Quantile Function Forecaster (MQF$^2$), a global probabilistic forecasting method constructed using a multivariate quantile function and investigate its application to multi-horizon forecasting. Prior approaches are either autoregressive, implicitly capturing the dependency structure across time but exhibiting error accumula...

We propose a neural network (NN) approach that yields approximate solutions for high-dimensional optimal control (OC) problems and demonstrate its effectiveness using examples from multiagent path finding. Our approach yields control in a feedback form, where the policy function is given by an NN. In particular, we fuse the Hamilton-Jacobi-Bellman...

Deep neural networks (DNNs) have shown their success as high-dimensional function approximators in many applications; however, training DNNs can be challenging in general. DNN training is commonly phrased as a stochastic optimization problem whose challenges include non-convexity, non-smoothness, insufficient regularization, and complicated data di...

We present a convection–diffusion inverse problem that aims to identify an unknown number of sources and their locations. We model the sources using a binary function, and we show that the inverse problem can be formulated as a large-scale mixed-integer nonlinear optimization problem. We show empirically that current state-of-the-art mixed-integer...

A normalizing flow is an invertible mapping between an arbitrary probability distribution and a standard normal distribution; it can be used for density estimation and statistical inference. Computing the flow follows the change of variables formula and thus requires invertibility of the mapping and an efficient way to compute the determinant of it...

Deep generative models (DGM) are neural networks with many hidden layers trained to approximate complicated, high‐dimensional probability distributions using samples. When trained successfully, we can use the DGM to estimate the likelihood of each observation and to create new samples from the underlying distribution. Developing DGMs has become one...

We propose a neural network approach for solving high-dimensional optimal control problems arising in real-time applications. Our approach yields controls in a feedback form and can therefore handle uncertainties such as perturbations to the system's state. We accomplish this by fusing the Pontryagin Maximum Principle (PMP) and Hamilton-Jacobi-Bell...

Deep generative models (DGM) are neural networks with many hidden layers trained to approximate complicated, high-dimensional probability distributions using a large number of samples. When trained successfully, we can use the DGMs to estimate the likelihood of each observation and to create new samples from the underlying distribution. Developing...

We demonstrate the ability of hybrid regularization methods to automatically avoid the double descent phenomenon arising in the training of random feature models (RFM). The hallmark feature of the double descent phenomenon is a spike in the regularization gap at the interpolation threshold, i.e. when the number of features in the RFM equals the num...

We present a multigrid-in-channels (MGIC) approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). It has been shown that there is a redundancy in standard CNNs, as networks with light or sparse convolution operators yield similar performance to f...

We propose a neural network approach for solving high-dimensional optimal control problems. In particular, we focus on multi-agent control problems with obstacle and collision avoidance. These problems immediately become high-dimensional, even for moderate phase-space dimensions per agent. Our approach fuses the Pontryagin's Maximum Principle and H...

Deep neural networks (DNNs) have achieved state-of-the-art performance across a variety of traditional machine learning tasks, e.g., speech recognition, image classification, and segmentation. The ability of DNNs to efficiently approximate high-dimensional functions has also motivated their use in scientific applications, e.g., to solve partial dif...

We present a multigrid approach that combats the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). It has been shown that there is a redundancy in standard CNNs, as networks with much sparser convolution operators can yield similar performance to full networks. The...

We compare the discretize-optimize (Disc-Opt) and optimize-discretize (Opt-Disc) approaches for time-series regression and continuous normalizing flows using neural ODEs. Neural ODEs, first described in Chen et al. (2018), are ordinary differential equations (ODEs) with neural network components; these models have competitively solved a variety of...

We present PNKH-B, a projected Newton-Krylov method with a low-rank approximated Hessian metric for approximately solving large-scale optimization problems with bound constraints. PNKH-B is geared toward situations in which function and gradient evaluations are expensive, and the (approximate) Hessian is only available through matrix-vector product...

Mean field games (MFG) and mean field control (MFC) are critical classes of multiagent models for the efficient analysis of massive populations of interacting agents. Their areas of application span topics in economics, finance, game theory, industrial engineering, crowd motion, and more. In this paper, we provide a flexible machine learning framew...

Partial differential equations (PDEs) are indispensable for modeling many physical phenomena and also commonly used for solving image processing tasks. In the latter area, PDE-based approaches interpret image data as discretizations of multivariate functions and the output of image processing algorithms as solutions to certain PDEs. Posing image pr...

Convolutional Neural Networks (CNNs) have become indispensable for solving machine learning tasks in speech recognition, computer vision, and other areas that involve high-dimensional data. A CNN filters the input feature using a network containing spatial convolution operators with compactly supported stencils. In practice, the input data and the...

This white paper came out of an exploratory workshop held on November 15-16, 2019 at the Institute for Pure and Applied Mathematics at UCLA. Represented at the workshop were members of the mathematics, machine learning, cryptography, philosophy, social science, legal, and policy communities. Discussion at the workshop focused on the impact of deep...

We present a convection-diffusion inverse problem that aims to identify an unknown number of sources and their locations. We model the sources using a binary function, and we show that the inverse problem can be formulated as a large-scale mixed-integer nonlinear optimization problem. We show empirically that current state-of-the-art mixed-integer...

Phase recovery from the bispectrum is a central problem in speckle interferometry which can be posed as an optimization problem minimizing a weighted nonlinear least-squares objective function. We look at two different formulations of the phase recovery problem from the literature, both of which can be minimized with respect to either the recovered...

We present ADMM-Softmax, an alternating direction method of multipliers (ADMM) for solving multinomial logistic regression (MLR) problems. Our method is geared toward supervised classification tasks with many examples and features. It decouples the nonlinear optimization problem in MLR into three steps that can be solved efficiently. In particular,...

The hMRI toolbox is an open-source toolbox for the calculation of quantitative MRI parameter maps from a series of weighted imaging data, and optionally additional calibration data. The multi-parameter mapping (MPM) protocol, incorporating calibration data to correct for spatial variation in the scanner's transmit and receive fields, is the most co...

Convolutional Neural Networks (CNNs) filter the input data using a series of spatial convolution operators with compact stencils and point-wise non-linearities. Commonly, the convolution operators couple features from all channels, which leads to immense computational cost in the training of and prediction with CNNs. To improve the efficiency of CN...

Deep convolutional neural networks have revolutionized many machine learning and computer vision tasks. Despite their enormous success, remaining key challenges limit their wider use. Pressing challenges include improving the network's robustness to perturbations of the input images and simplifying the design of architectures that generalize. Anoth...

Neuroscience and clinical researchers are increasingly interested in quantitative magnetic resonance imaging (qMRI) due to its sensitivity to micro-structural properties of brain tissue such as axon, myelin, iron and water concentration. We introduce the hMRI-toolbox, an open-source, easy-to-use tool available on GitHub, for qMRI data handling and...

Phase recovery from the bispectrum is a central problem in speckle interferometry which can be posed as an optimization problem minimizing a weighted nonlinear least-squares objective function. We look at two separate formulations of the phase recovery problem from the literature, both of which can be minimized with respect to either the recovered pha...

Residual neural networks (ResNets) are a promising class of deep neural networks that have shown excellent performance for a number of learning tasks, e.g., image classification and recognition. Mathematically, ResNet architectures can be interpreted as forward Euler discretizations of a nonlinear initial value problem whose time-dependent control...

Estimating parameters of Partial Differential Equations (PDEs) is of interest in a number of applications such as geophysical and medical imaging. Parameter estimation is commonly phrased as a PDE-constrained optimization problem that can be solved iteratively using gradient-based optimization. A computational bottleneck in such approaches is that...

Quantitative magnetic resonance imaging (qMRI) finds increasing application in neuro-science and clinical research due to its sensitivity to micro-structural properties of brain tissue, e.g. axon, myelin, iron and water concentration. We introduce the hMRI-toolbox, an easy-to-use open-source tool for handling and processing of qMRI data presented t...

We consider a global variable consensus ADMM algorithm for solving large-scale PDE parameter estimation problems asynchronously and in parallel. To this end, we partition the data and distribute the resulting subproblems among the available workers. Since each subproblem can be associated with different forward models and right-hand-sides, this pro...

In this work, we present a new derivative-free optimization method and investigate its use for training neural networks. Our method is motivated by the Ensemble Kalman Filter (EnKF), which has been used successfully for solving optimization problems that involve large-scale, highly nonlinear dynamical systems. A key benefit of the EnKF method is th...

The main computational cost in the training of and prediction with Convolution Neural Networks (CNNs) typically stems from the convolution. In this paper, we present three novel ways to parameterize the convolution more efficiently, significantly decreasing the computational complexity. Commonly used CNNs filter the input data using a series of spa...

Many inverse problems involve two or more sets of variables that represent different physical quantities but are tightly coupled with each other. For example, image super-resolution requires joint estimation of the image and motion parameters from noisy measurements. Exploiting this structure is key for efficiently solving these large-scale optimiz...

We provide a mathematical formulation and develop a computational framework for identifying multiple strains of microorganisms from mixed samples of DNA. Our method is applicable in public health domains where efficient identification of pathogens is paramount, such as the monitoring of disease outbreaks. We formulate strain identification as an in...

We present an improved technique for susceptibility artifact correction in echo-planar imaging (EPI), a widely used ultra-fast magnetic resonance imaging (MRI) technique. Our method corrects geometric deformations and intensity modulations present in EPI images. We consider a tailored variational image registration problem incorporating a physical...

In this paper, we address the challenging problem of optimal experimental design (OED) of inverse problems with state constraints. We consider two OED formulations that allow us to reduce the experimental costs by minimizing the number of measurements. The first formulation assumes a fine discretization of the design parameter space and uses sparsi...

We present novel numerical methods for Polyline-to-Point-Cloud Registration and their application to patient-specific modeling of deployed coronary artery stents from image data. Patient-specific coronary stent reconstruction is an important challenge in computational hemodynamics and relevant to the design and improvement of the prostheses. It is...

Attempts for in-vivo histology require a high spatial resolution that comes with the price of a decreased signal-to-noise ratio. We present a novel iterative and multi-scale smoothing method for quantitative Magnetic Resonance Imaging (MRI) data that yield proton density, apparent transverse and longitudinal relaxation, and magnetization transfer m...

Recently, deep residual networks have been successfully applied in many computer vision and natural language processing tasks, pushing the state-of-the-art performance with deeper and wider architectures. In this work, we interpret deep residual networks as ordinary differential equations (ODEs), which have long been studied in mathematics and phys...

In this paper, we address the challenging problem of optimal experimental design (OED) of constrained inverse problems. We consider two OED formulations that allow reducing the experimental costs by minimizing the number of measurements. The first formulation assumes a fine discretization of the design parameter space and uses sparsity promoting re...

Image registration is a central problem in a variety of areas involving imaging techniques and is known to be challenging and ill-posed. Regularization functionals based on hyperelasticity provide a powerful mechanism for limiting the ill-posedness. A key feature of hyperelastic image registration approaches is their ability to model large deformat...

We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported...

In this work we explore the connection between Convolution Neural Networks, partial differential equations, multigrid/multiscale methods and optimal control. We show that convolution neural networks can be represented as a discretization of nonlinear partial differential equations, and that the learning process can be interpreted as a control p...

We present two efficient numerical methods for susceptibility artifact correction in Echo Planar Imaging (EPI), an ultra fast Magnetic Resonance Imaging (MRI) technique widely used in clinical applications. Our methods address a major practical drawback of EPI, the so-called susceptibility artifacts, which consist of geometrical transformations and...

In contrast to classical T1, T2, or PD-weighted imaging which acquires intensity values in arbitrary units, quantitative imaging (qMRI) has the clear advantage of providing absolute values comparable across sites and time. Multi-Parameter Mapping (MPM; Weiskopf, 2013; Lutti, 2014) is a framework for qMRI that simultaneously measures the proton dens...

Reconstructing images from indirect measurements is a central problem in many applications, including the subject of this special issue, quantitative susceptibility mapping (QSM). The process of image reconstruction typically requires solving an inverse problem that is ill-posed and large-scale and thus challenging to solve. Although the research f...