Matthew Knepley

University at Buffalo, The State University of New York | SUNY Buffalo · Department of Computer Science and Engineering

PhD

About

169
Publications
32,300
Reads
2,919
Citations
Citations since 2016
58 Research Items
1578 Citations
[Chart: years 2016–2022 on the horizontal axis, 0–300 on the vertical axis]
Additional affiliations
August 2017 - present
University at Buffalo, The State University of New York
Position
  • Professor (Associate)
June 2015 - August 2017
Rice University
Position
  • Professor (Assistant)
May 2010 - May 2012
Monash University (Australia)
Position
  • Adjunct Senior Research Fellow
Education
January 1997 - December 2000
Purdue University
Field of study
  • Computer Science
September 1995 - December 1996
University of Minnesota Twin Cities
Field of study
  • Computer Science
September 1994 - May 1995
University of Chicago
Field of study
  • Physics

Publications

Publications (169)
Preprint
Full-text available
Two important classes of three-dimensional elements in computational meshes are hexahedra and tetrahedra. While several efficient methods exist that convert a hexahedral element to tetrahedral elements, the existing algorithm for tetrahedralization of a hexahedral complex is the marching tetrahedron algorithm, which limits pre-selection of face di...
Preprint
Full-text available
Particle-in-Cell (PIC) methods employ particle representations of unknown fields, but also employ continuum fields for other parts of the problem. Thus projection between particle and continuum bases is required. Moreover, we often need to enforce conservation constraints on this projection. We derive a mechanism for enforcement based on weak equal...
Article
Droplet formation happens in finite time due to the surface tension force. The linear stability analysis is useful to estimate droplet size but fails to approximate droplet shape. This is due to a highly non-linear flow description near the point where the first pinch-off happens. A one-dimensional axisymmetric mathematical model was first develope...
Preprint
Full-text available
Stencil composition uses the idea of function composition, wherein two stencils with arbitrary orders of derivative are composed to obtain a stencil with a derivative order equal to sum of the orders of the composing stencils. In this paper, we show how stencil composition can be applied to form finite difference stencils in order to numerically so...
Preprint
Full-text available
Finite element analysis of solid mechanics is a foundational tool of modern engineering, with low-order finite element methods and assembled sparse matrices representing the industry standard for implicit analysis. We use performance models and numerical experiments to demonstrate that high-order methods greatly reduce the costs to reach engineerin...
Preprint
Full-text available
This research note documents the integration of the MPI-parallel metric-based mesh adaptation toolkit ParMmg into the solver library PETSc. This coupling brings robust, scalable anisotropic mesh adaptation to a wide community of PETSc users, as well as users of downstream packages. We demonstrate the new functionality via the solution of Poisson pr...
Preprint
Full-text available
The communities who develop and support open source scientific software packages are crucial to the utility and success of such packages. Moreover, these communities form an important part of the human infrastructure that enables scientific progress. This paper discusses aspects of the PETSc (Portable Extensible Toolkit for Scientific Computation)...
Article
Full-text available
The communities that develop and support open-source scientific software packages are crucial to the utility and success of such packages. Moreover, they form an important part of the human infrastructure that enables scientific progress. This article discusses aspects of the Portable Extensible Toolkit for Scientific Computation community, its org...
Article
Full-text available
The Portable Extensible Toolkit for Scientific Computation (PETSc) library delivers scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization. The PETSc design for performance portability addresses fundamental GPU accelerator challenges and stresses flexibility and extensibility by separating...
Article
Full-text available
We lay out the ramifications of the 2020 pandemic for all people in geosciences, especially the young, and argue for significant changes in training and career development. We focus primarily on its devastating impact in the U.S.A. and compare with that in other countries, especially China. We review the potential effect for the next four years or so on...
Article
We lay out the ramifications of the 2020 pandemic for all people in geosciences, especially the young, and argue for significant changes in training and career development. We focus primarily on its devastating impact in the U.S.A. and compare with that in other countries, especially China. We review the potential effect for the next four years or so on...
Article
Full-text available
Effective relaxation methods are necessary for good multigrid convergence. For many equations, standard Jacobi and Gauß–Seidel are inadequate, and more sophisticated space decompositions are required; examples include problems with semidefinite terms or saddle point structure. In this article, we present a unifying software abstraction, PCPATCH, fo...
Article
Artificial intelligence combined with high-performance computing could trigger a fundamental change in how geoscientists extract knowledge from large volumes of data.
Article
PetscSF, the communication component of the Portable, Extensible Toolkit for Scientific Computation (PETSc), is designed to provide PETSc's communication infrastructure suitable for exascale computers that utilize GPUs and other accelerators. PetscSF provides a simple application programming interface (API) for managing common communication patterns...
Preprint
Full-text available
The Landau form of the Fokker-Planck equation is the gold standard for plasmas dominated by small angle collisions; however, its $\mathcal{O}(N^2)$ work complexity has limited its practicality. This paper extends previous work on a fully conservative finite element method for this Landau collision operator with adaptive mesh refinement, optimized for vec...
Article
The global total-$f$ gyrokinetic particle-in-cell code XGC, used to study transport in magnetic fusion plasmas or to couple with a core gyrokinetic code while functioning as an edge gyrokinetic code, implements a five-dimensional continuum grid to perform the dissipative operations, such as plasma collisions, or to exchange the particle distribut...
Preprint
Full-text available
In this work, we collect data from runs of Krylov subspace methods and pipelined Krylov algorithms in an effort to understand and model the impact of machine noise and other sources of variability on performance. We find large variability of Krylov iterations between compute nodes for standard methods that is reduced in pipelined algorithms, direct...
Article
Full-text available
Large-scale PDE simulations using high-order finite-element methods on unstructured meshes are an indispensable tool in science and engineering. The widely used open-source PETSc library offers an efficient representation of generic unstructured meshes within its DMPlex module. This paper details our recent implementation of parallel mesh reading a...
Preprint
Full-text available
PetscSF, the communication component of the Portable, Extensible Toolkit for Scientific Computation (PETSc), is being used to gradually replace the direct MPI calls in the PETSc library. PetscSF provides a simple application programming interface (API) for managing common communication patterns in scientific computations by using a star-forest grap...
Preprint
Full-text available
Since 2015 much has developed in geodynamical modeling because of the arrival of Big Data. We present here an overview of numerical techniques but also a scan of the new opportunities in this age of Big Data and prepare the community for the coming decade, the roaring twenties, when Data Analytics will reign. We begin with a review of traditional n...
Preprint
Full-text available
The global total-$f$ gyrokinetic particle-in-cell code XGC, used to study transport in magnetic fusion plasmas, implements a continuum grid to perform the dissipative operations, such as plasma collisions. To transfer the distribution function between marker particles and a rectangular velocity-space grid, XGC employs a bilinear mapping. The conser...
Preprint
Full-text available
The Portable Extensible Toolkit for Scientific Computation (PETSc) library delivers scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization. The PETSc design for performance portability addresses fundamental GPU accelerator challenges and stresses flexibility and extensibility by separating t...
Article
Full-text available
In this work, we collect data from runs of Krylov subspace methods and pipelined Krylov algorithms in an effort to understand and model the impact of machine noise and other sources of variability on performance. We find large variability of Krylov iterations between compute nodes for standard methods that is reduced in pipelined algorithms, direct...
Preprint
Full-text available
Large-scale PDE simulations using high-order finite-element methods on unstructured meshes are an indispensable tool in science and engineering. The widely used open-source PETSc library offers an efficient representation of generic unstructured meshes within its DMPlex module. This paper details our recent implementation of parallel mesh reading a...
Article
Full-text available
A hybridizable discontinuous Galerkin method of arbitrary high order is formulated to solve the miscible displacement problem in porous media. The spatial discretization is combined with a sequential algorithm that decouples the flow and the transport equations. Hybridization produces a linear system for the globally coupled degrees of freedom, th...
Chapter
Full-text available
Since 2015 much has developed in geodynamical modeling because of the arrival of Big Data. We present in two parts an overview of numerical techniques but also a scan of the new opportunities in this age of Big Data and prepare the community for the coming decade, the roaring twenties, when Data Analytics will reign. We begin with a review of tradi...
Chapter
Full-text available
Since 2015 much has developed in geodynamical modeling because of the arrival of Big Data. We present in two parts an overview of numerical techniques but also a scan of the new opportunities in this age of Big Data and prepare the community for the coming decade, the roaring twenties, when Data Analytics will reign. We begin with a review of tradi...
Preprint
Full-text available
Effective relaxation methods are necessary for good multigrid convergence. For many equations, standard Jacobi and Gauß–Seidel are inadequate, and more sophisticated space decompositions are required; examples include problems with semidefinite terms or saddle point structure. In this paper we present a unifying software abstraction, PCPATCH, f...
Article
The focus of this paper is the analysis of families of hybridizable interior penalty discontinuous Galerkin methods for second order elliptic problems. We derive a priori error estimates in the energy norm that are optimal with respect to the mesh size. Suboptimal L² norm error estimates are proven. These results are valid in two and three dimensio...
Article
The objective of this paper is twofold. First, we propose two composable block solver methodologies to solve the discrete systems that arise from finite element discretizations of the double porosity/permeability (DPP) model. The DPP model, which is a four-field mathematical model, describes the flow of a single-phase incompressible fluid in a poro...
Article
Full-text available
In this paper, we present a series of mathematical abstractions for seismologically relevant wave equations discretized using finite-element methods, and demonstrate how these abstractions can be implemented efficiently in computer code. Our motivation is to mitigate the combinatorial complexity present when considering geophysical waveform modelli...
Article
Edema, also termed oedema, is a generalized medical condition associated with an abnormal aggregation of fluid in a tissue matrix. In the intestine, excessive edema can lead to serious health complications associated with reduced motility. A $7.5\%$ solution of hypertonic saline (HS) has been hypothesized as an effective means to reduce the effects...
Article
We present a parallel computing strategy for a hybridizable discontinuous Galerkin (HDG) nested geometric multigrid (GMG) solver. Parallel GMG solvers require a combination of coarse-grain and fine-grain parallelism to improve time-to-solution performance. In this work we focus on fine-grain parallelism. We use Intel's second generation Xeon Phi (K...
Article
Full-text available
We demonstrate that the recently developed solvation-layer interface condition (SLIC) continuum dielectric model for molecular electrostatics, when combined with a simple solvent-accessible-surface-area (SASA)-proportional model for nonpolar solvent effects, is able to predict solvation entropies of neutral and charged small molecules with high acc...
Preprint
We present a new method for approximating solutions to the incompressible miscible displacement problem in porous media. At the discrete level, the coupled nonlinear system has been split into two linear systems that are solved sequentially. The method is based on a hybridizable discontinuous Galerkin method for the Darcy flow, which produces a mas...
Preprint
Full-text available
The objective of this paper is twofold. First, we propose two composable block solver methodologies to solve the discrete systems that arise from finite element discretizations of the double porosity/permeability (DPP) model. The DPP model, which is a four-field mathematical model, describes the flow of a single-phase incompressible fluid in a poro...
Article
Full-text available
We present a performance analysis appropriate for comparing algorithms using different numerical discretizations. By taking into account the total time-to-solution, numerical accuracy with respect to an error norm, and the computation rate, a cost-benefit analysis can be performed to determine which algorithm and discretization are particularly sui...
Article
Full-text available
We present a new method for simulating incompressible immiscible two-phase flow in porous media. The semi-implicit method decouples the wetting phase pressure and saturation equations. The equations are discretized using a hybridizable discontinuous Galerkin (HDG) method. The proposed method is of high order, conserves global/local mass balance, an...
Article
Full-text available
Kolmogorov famously proved that multivariate continuous functions can be represented as a superposition of a small number of univariate continuous functions, $$ f(x_1,\dots,x_n) = \sum_{q=0}^{2n+1} \chi^q \left( \sum_{p=1}^n \psi^{pq}(x_p) \right).$$ Fridman posed the best smoothness bound for the functions $\psi^{pq}$, that such fun...
Article
Full-text available
Ellipsoidal harmonics are a useful generalization of spherical harmonics but present additional numerical challenges. One such challenge is in computing ellipsoidal normalization constants which require approximating a singular integral. In this paper, we present results for approximating normalization constants using a well-known decomposition and...
Article
Full-text available
We present a heterogeneous computing strategy for a hybridizable discontinuous Galerkin (HDG) geometric multigrid (GMG) solver. Parallel GMG solvers require a combination of coarse-grain and fine-grain parallelism to improve time-to-solution performance. In this work we focus on fine-grain parallelism. We use Intel's second generation X...
Article
Full-text available
Important computational physics problems are often large-scale in nature, and it is highly desirable to have robust and high performing computational frameworks that can quickly address these problems. However, it is no trivial task to determine whether a computational framework is performing efficiently or is scalable. The aim of this paper is to...
Article
We extend the linearized Poisson-Boltzmann (LPB) continuum electrostatic model for molecular solvation to address charge-hydration asymmetry. Our new solvation-layer interface condition (SLIC)/LPB corrects for first-shell response by perturbing the traditional continuum-theory interface conditions at the protein-solvent and the Stern-layer interfac...
Article
Full-text available
The Landau collision integral is an accurate model for the small-angle dominated Coulomb collisions in fusion plasmas. We investigate a high order accurate, fully conservative, finite element discretization of the nonlinear multi-species Landau integral with adaptive mesh refinement using the PETSc library (www.mcs.anl.gov/petsc). We develop algori...
Article
Full-text available
The immersed boundary (IB) method is a widely used approach to simulating fluid-structure interaction (FSI). Although explicit versions of the IB method can suffer from severe time step size restrictions, these methods remain popular because of their simplicity and generality. In prior work (Guy et al., Adv Comput Math, 2015), some of us developed...
Article
Full-text available
We demonstrate that with two small modifications, the popular dielectric continuum model is capable of predicting, with high accuracy, ion solvation thermodynamics in numerous polar solvents, and ion solvation free energies in water–co-solvent mixtures. The first modification involves perturbing the macroscopic dielectric-flux interface condition...
Conference Paper
Full-text available
Despite decades of research in this area, mesh adaptation capabilities are still rarely found in numerical simulation software. We postulate that the primary reason for this is lack of usability. Integrating mesh adaptation into existing software is difficult as non-trivial operators, such as error metrics and interpolation operators, are required,...
Article
Full-text available
We present a novel, quadrature-based finite element integration method for low-order elements on GPUs, using a pattern we call thread transposition to avoid reductions while vectorizing aggressively. On the NVIDIA GTX580, which has a nominal single precision peak flop rate of 1.5 TF/s and a memory bandwidth of 192 GB/s, we achieve close to...
Article
In this paper, we extend the familiar continuum electrostatic model to incorporate finite-size effects in the solvation layer, by perturbing the usual macroscopic interface condition. The perturbation is based on the mean spherical approximation (MSA), to derive a multiscale solvation-layer interface condition (SLIC/MSA). We show that SLIC/MSA repr...
Conference Paper
Full-text available
Elliptic partial differential equations (PDEs) frequently arise in continuum descriptions of physical processes relevant to science and engineering. Multilevel preconditioners represent a family of scalable techniques for solving discrete PDEs of this type and thus are the method of choice for high-resolution simulations. The scalability and time-t...
Article
Pipelined Krylov methods seek to ameliorate the latency due to inner products necessary for projection by overlapping it with the computation associated with sparse matrix-vector multiplication. We clarify a folk theorem that this can only result in a speedup of 2× over the naive implementation. Examining many repeated runs, we show that stochastic...
Article
Traditional Poisson-Boltzmann models of molecular electrostatics fail to model the behavior of the hydration-shell solvent, which deviates significantly from bulk. Here we present a hydration-shell Poisson-Boltzmann (HSPB) model that accounts for these deviations by perturbing the displacement-field boundary condition. The HSPB model correctly pred...
Code
PyLith is an open-source finite-element code for dynamic and quasistatic simulations of crustal deformation, primarily earthquakes and volcanoes. • Main page: [https://geodynamics.org/cig/software/pylith](https://geodynamics.org/cig/software/pylith) • User Manual • Binary packages • Utility to build PyLith and all of its dependencies from source...
Article
Full-text available
Electrostatic forces play many important roles in molecular biology, but are hard to model due to the complicated interactions between biomolecules and the surrounding solvent, a fluid composed of water and dissolved ions. Continuum models have been surprisingly successful for simple biological questions, but fail for important problems such as unde...